Abstract

The  coordination  of  spatially  distributed  systems  of  cooperating  agents,  which 
perform  an  assigned  mission  in  the  presence  of  uncertainty  and  system  faults,  is  an 
important  emerging  technology.  The  actions  and  health  of  these  distributed  systems 
depend  upon  the  information  that  can  be  communicated  and  the  knowledge  of  the  current 
capabilities  of  all  cooperating  agents.  Methodologies  for  the  distribution  of  estimation 
and  redundancy  management  functions  over  the  dynamic  network  of  cooperating  agents 
were developed, leading to effective team strategies.

Progress has been made on various aspects of the distributed systems problem.
At the fundamental level, we investigated the decentralized control problem with
constrained communication. In parallel, the allocation of transmit power in wireless
networks was studied as a decentralized control problem because it has a
simple structure and the information communicated is constrained. In the area of health
monitoring, new robust analytical redundancy methods have been developed which
detect, identify, and reconstruct sensor, actuator, and plant faults. A robust
multiple-fault filter is developed based on a performance measure from which the desired
detection subspaces are approximately constructed. This detection filter formulation,
which includes uncertainty, is the basis for single-fault time-varying, decentralized
detection filters, and fault magnitude reconstruction. An innovative application of the
distributed detection filter methodology is to the target track association problem.
Finally, the distributed estimation problem was addressed by considering elements of the
relative navigation problem among distributed vehicles. Exact statistical solutions to the
pseudorange equations in GPS and an efficient nonlinear filter based on multiple
hypothesis sequential probability ratio tests for resolving the integer ambiguity in
differential carrier GPS were developed and extended.

Accomplishments 

The following accomplishments in the study of cooperative agents are divided into three
categories: decentralized control, robust fault detection filters for distributed analytic
redundancy management systems, and nonlinear estimation applied to relative GPS
navigation among moving vehicles.

1. Decentralized Control


1.1  A  Stochastic  Decentralized  Control  Problem  with  Noisy  Information 

A  simple  decentralized  stochastic  control  problem  is  considered  where  the  non-classical 
nature  of  the  information  pattern  is  induced  by  the  uncertainty  of  the  information 
transmission  in  the  system  [1,  Appendix  A].  This  is  in  fact  a  reformulation  of  the 
Witsenhausen counter-example, where the first station is allowed to send its information
to  the  second  station  through  a  noisy  channel.  Non-convexity  of  the  problem  in  this  new 
formulation  has  been  established  and  it  is  shown  how  this  formulation  relates  to  a 
classical  problem  and  the  Witsenhausen  problem,  respectively,  when  the  transmission 
noise  intensity  goes  to  zero  or  infinity.  Assuming  a  small  transmission  noise  intensity,  an 
asymptotic  approach  is  then  used  in  order  to  find  an  approximated  cost.  A  necessary 
condition  for  asymptotically  optimal  strategies  has  been  obtained  using  a  variational 
approach  and  it  is  shown  that  the  linear  strategies,  with  slightly  different  coefficients  than 
the  noiseless  transmission  case,  satisfy  the  necessary  condition. 
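
The two limiting cases can be made concrete by evaluating the cost of linear strategies in closed form. The sketch below assumes the two-stage model x1 = x0 + u1, x2 = x1 - u2 with x0 ~ N(0, sigma0^2), a second-station measurement y2 = x1 + v2 with unit-variance noise, and a received message z = x0 + vt with vt ~ N(0, eps^2). The cost k^2 u1^2 + x2^2 is the classical Witsenhausen cost and is an assumption here, as are all function and parameter names; the second station is assumed to apply the minimum mean-square estimate of x1, which is linear for linear first-stage strategies.

```python
def linear_strategy_cost(a, k, sigma0, eps):
    """Expected cost E[k^2 u1^2 + x2^2] for u1 = a*x0 and u2 = E[x1 | y2, z].

    Assumed model (illustrative only):
      x1 = x0 + u1,  x2 = x1 - u2,  x0 ~ N(0, sigma0^2)
      y2 = x1 + v2,  v2 ~ N(0, 1)       (second station's own measurement)
      z  = x0 + vt,  vt ~ N(0, eps^2)   (message received over the noisy channel)
    """
    gain = 1.0 + a  # x1 = gain * x0
    # Information (inverse variance) about x0: prior + y2 + z.
    info = 1.0 / sigma0**2 + gain**2 + 1.0 / eps**2
    # x2 = x1 - E[x1 | y2, z] is the estimation error of x1 = gain * x0.
    stage2 = gain**2 / info
    stage1 = (k * a * sigma0) ** 2
    return stage1 + stage2


def best_linear_gain(k, sigma0, eps, lo=-2.0, hi=1.0, steps=3001):
    """Grid search for the coefficient a that minimizes the cost."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda a: linear_strategy_cost(a, k, sigma0, eps))


if __name__ == "__main__":
    k, sigma0 = 0.2, 5.0
    for eps in (1e-3, 0.1, 1.0, 1e3):
        a = best_linear_gain(k, sigma0, eps)
        cost = linear_strategy_cost(a, k, sigma0, eps)
        print(f"eps={eps:g}: best a={a:.3f}, cost={cost:.4f}")
```

As eps tends to zero, the second-stage term vanishes and the classical case is recovered; as eps grows, the expression approaches the cost of linear strategies in the Witsenhausen problem. Linear strategies are known not to be optimal in that limit, so the sketch only gives an upper bound on the optimal cost there.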

1.2  Application  to  Power  Allocation  in  Cellular  Radio  Networks 

A  distributed  Dynamic  Channel  and  Power  Allocation  (DCPA)  scheme  based  on  a  novel 
predictive  power  control  algorithm  is  proposed  [2,  Appendix  B].  Power  control  is 
considered  an  efficient  scheme  to  mitigate  co-channel  and  multiple-access  interference  in 
cellular  radio  systems.  Various  approaches  have  been  proposed  in  recent  years  to  design 
power  control  algorithms.  We  focus  on  the  feedback  algorithms  that  are  based  on  Signal 
to  Interference  plus  Noise  Ratios  (SIR-based  algorithms).  We  review  SIR  threshold 
approaches  and  then  discuss  how  power  control  design  can  be  formulated  as  a 
decentralized  regulation  problem.  We  use  a  robust  control  framework  to  analyze  global 
stability  of  a  network  on  a  single  channel.  We  obtain  a  sufficient  condition,  which 
guarantees that the deviations of the power levels from their optimal values remain
bounded,  even  when  the  channel  gains  change,  as  long  as  the  network  stays  feasible  [3, 
Appendix  C].  The  Minimum  Interference  Dynamic  Channel  Assignment  algorithm  is 
employed,  while  simple  Kalman  Filters  are  designed  to  provide  the  predicted 
measurements  of  both  the  channel  gains  and  the  interference  levels,  which  are  then  used 
to  update  the  power  levels.  Extensive  computer  simulations  are  carried  out  to  show  the 
improvement  in  performance,  under  the  dynamics  of  user  arrivals  and  departures  and  user 
mobility.  It  is  shown  that  the  number  of  dropped  calls  and  the  number  of  blocked  calls 
are  decreased  while,  on  average,  fewer  channel  reassignments  per  call  are  required  [2, 
Appendix  B]. 
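
As a point of reference for the SIR-based feedback algorithms discussed above, the sketch below implements the classical distributed power update in which each user scales its power by the ratio of the target SIR to its measured SIR. This is the standard fixed-target iteration, not the predictive, Kalman-filter-based scheme of [2]; the gain matrix, noise level, and target are made-up values chosen so that the network is feasible, and the function names are illustrative.

```python
def sir(powers, gains, noise):
    """SIR of each link: own received power over interference plus noise.

    gains[i][j] is the (assumed) link gain from transmitter j to receiver i.
    """
    n = len(powers)
    out = []
    for i in range(n):
        interference = sum(gains[i][j] * powers[j] for j in range(n) if j != i)
        out.append(gains[i][i] * powers[i] / (interference + noise))
    return out


def power_control(gains, noise, target, steps=100, p0=1.0):
    """Iterate p_i <- (target / SIR_i) * p_i using only local SIR measurements."""
    powers = [p0] * len(gains)
    for _ in range(steps):
        current = sir(powers, gains, noise)
        powers = [target / s * p for s, p in zip(current, powers)]
    return powers


if __name__ == "__main__":
    G = [[1.0, 0.1, 0.1],
         [0.1, 1.0, 0.1],
         [0.1, 0.1, 1.0]]
    p = power_control(G, noise=0.01, target=2.0)
    print("powers:", p)
    print("SIRs:  ", sir(p, G, 0.01))
```

Each user needs only its own SIR measurement, which is what makes the update decentralized. When the target is infeasible for the given gains, the powers diverge, which is why feasibility appears as a condition in the stability result above.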

1.3  Periodic  Control 

A Π test is presented for determining when a controller with periodic gains is superior to
an LTI compensator for a class of LQ strong stabilization problems [4, Appendix D]. It
has been noted that only strongly stabilizing compensators can stabilize a certain type of
decentralized system. For systems with strictly proper transfer functions, it is proven
that stable high frequency periodic controllers based on weak variations about the LTI
case cannot give better performance than stable LTI compensators. In the development, a
means  to  evaluate  the  second  partials  of  functions  with  respect  to  matrix  valued 
parameters  is  introduced.  These  techniques  can  be  trivially  modified  to  deal  with 
problems  involving  optimizing  decentralized  controllers  for  systems  with  fixed  modes. 


2.  Fault  Detection  and  Distributed  Detection  Filters 

2.1  A  Generalized  Least-Squares  Fault  Detection  Filter 

A  fault  detection  and  identification  algorithm  is  determined  from  a  generalization  of  the 
least-squares  derivation  of  the  Kalman  filter  [5,  Appendix  E].  The  objective  of  the  filter 
is  to  monitor  a  single  fault  called  the  target  fault  and  block  other  faults,  which  are  called 
nuisance  faults.  The  filter  is  derived  from  solving  a  min-max  problem  with  a  generalized 
least-squares  cost  criterion  which  explicitly  makes  the  residual  sensitive  to  the  target 
fault,  but  insensitive  to  the  nuisance  faults.  It  is  shown  that  this  filter  approximates  the 
properties  of  the  classical  least-squares  fault  detection  filter  such  that  in  the  limit  where 
the  weighting  on  the  nuisance  fault  is  zero,  the  generalized  least-squares  fault  detection 
filter becomes equivalent to the unknown input observer where there exists a reduced-order
filter. Filter designs can be obtained for both linear time-invariant and time-varying
systems.


2.2  Robust  Multiple-Fault  Detection  Filter 

A  new  robust  multiple-fault  detection  and  identification  algorithm  is  proposed  [6, 
Appendix F]. Unlike other algorithms, which explicitly force the geometric
structure by using eigenstructure assignment or geometric theory, this algorithm is
derived  by  solving  an  optimization  problem.  The  output  error  is  divided  into  several 
subspaces. For each subspace, the transmission from one fault, denoted the associated
target fault, is maximized, and the transmission from the other faults, denoted the associated
nuisance faults, is minimized. Therefore, each projected residual of the robust multiple-fault
detection filter is affected primarily by one fault and minimally by the other faults.
The  transmission  from  process  and  sensor  noise  is  also  minimized  so  that  the  filter  is 
robust  with  respect  to  these  disturbances.  It  is  shown  that  this  filter  approximates  the 
properties  of  the  restricted  diagonal  filter  of  which  the  Beard-Jones  detection  filter  is  a 
special  case.  In  the  limit  where  the  weighting  on  each  associated  nuisance  fault 
transmission  goes  to  infinity,  the  geometric  structure  of  the  restricted  diagonal  detection 
filter  is  recovered.  When  it  is  not  in  the  limit,  the  filter  only  isolates  the  faults  within 
approximate  invariant  subspaces.  This  new  feature  allows  the  filter  to  be  potentially 
more  robust  since  the  filter  structure  is  less  constrained.  Filter  design  can  be  obtained  for 
both time-invariant and time-varying linear systems.
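
The geometric limit mentioned above can be illustrated with a deliberately simple discrete-time example. The sketch assumes full state measurement (y = x), so that the observer gain L = A - pole*I gives error dynamics e(k+1) = pole*e(k) + F1*mu1 + F2*mu2 and each fault stays confined to its own direction; a projector that annihilates F2 then yields a residual driven only by the target fault. The system matrices, fault directions, and names are hypothetical, and this shows the exact geometric structure rather than the optimization-based filter of [6].

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def axpy(c, u, v):
    """Return c*u + v for vectors u and v."""
    return [c * a + b for a, b in zip(u, v)]


def projector_blocking(f):
    """Orthogonal projector I - f f^T / (f^T f), which annihilates direction f."""
    norm2 = sum(a * a for a in f)
    n = len(f)
    return [[(1.0 if i == j else 0.0) - f[i] * f[j] / norm2 for j in range(n)]
            for i in range(n)]


def projected_residual(mu1, mu2, steps=50, pole=0.5):
    """Simulate plant and observer with constant fault magnitudes mu1 and mu2."""
    A = [[0.9, 0.2], [0.0, 0.8]]  # hypothetical stable plant
    F1 = [1.0, 0.0]               # target fault direction
    F2 = [1.0, 1.0]               # nuisance fault direction
    # With y = x, this gain makes the error dynamics e+ = pole*e + faults.
    L = [[A[i][j] - (pole if i == j else 0.0) for j in range(2)] for i in range(2)]
    H = projector_blocking(F2)
    x, xhat = [0.0, 0.0], [0.0, 0.0]
    for _ in range(steps):
        residual = axpy(-1.0, xhat, x)                  # y - xhat with y = x
        xhat = axpy(1.0, matvec(A, xhat), matvec(L, residual))
        x = axpy(mu2, F2, axpy(mu1, F1, matvec(A, x)))  # faults drive the plant
    return matvec(H, axpy(-1.0, xhat, x))


if __name__ == "__main__":
    print("nuisance only:", projected_residual(0.0, 3.0))
    print("target only:  ", projected_residual(1.0, 0.0))
```

The projected residual stays at numerical zero when only the nuisance fault is active and is unaffected by the nuisance magnitude when both faults are present. Away from this limit, the filter described above trades that exact blocking for a less constrained structure.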

2.2  Optimal  Stochastic  Fault  Detection  Filter 

A  fault  detection  and  identification  algorithm,  called  optimal  stochastic  fault  detection 
filter, is determined [7, Appendix G]. The objective of the filter is to monitor a single
fault, called the target fault, and block other faults, called the nuisance faults, in
the presence of the process and sensor noises. The filter is derived by maximizing the
transmission  from  the  target  fault  to  the  projected  output  error  while  minimizing  the 
transmission  from  the  nuisance  faults.  Therefore,  the  residual  is  affected  primarily  by  the 
target  fault  and  minimally  by  the  nuisance  faults.  The  transmission  from  the  process  and 
sensor  noises  is  also  minimized  so  that  the  filter  is  robust  with  respect  to  these 
disturbances.  This  filter  is  a  special  case  of  the  detection  filter  of  [6,  Appendix  F].  It  is 
shown  that  this  filter  approximates  the  properties  of  the  classical  fault  detection  filter 
such  that  in  the  limit  where  the  weighting  on  the  nuisance  fault  transmission  goes  to 
infinity,  the  optimal  stochastic  fault  detection  filter  becomes  equivalent  to  the  unknown 
input  observer.  However,  the  nuisance  fault  directions  and  their  associated  invariant  zero 
directions  must  be  included  in  the  invariant  subspace  generated  by  the  optimal  stochastic 
fault  detection  filter.  The  asymptotic  behavior  of  the  filter  as  the  weighting  on  the 
nuisance  fault  transmission  becomes  large  is  determined  by  using  a  perturbation  method 
and  it  is  shown  that  the  geometric  structure  of  the  unknown  input  observer  is  recovered. 
Filter designs can be obtained for both time-invariant and time-varying systems.

2.3  Fault  Reconstruction  from  Sensor  and  Actuator  Failures 

An  approach  for  reconstructing  sensor  and  actuator  faults  from  the  residual  is  proposed 
[8,  Appendix  H].  The  transfer  matrix  from  the  faults  to  the  residual  is  derived  in  terms  of 
the  eigenvalues  of  the  fault  detection  filter  associated  with  the  invariant  subspaces  of  the 
fault and the invariant zeros of the faults. For each fault, all possible fault reconstruction
processes are parameterized by applying a projector to the transfer matrix and
taking its inverse. Then, the optimal fault reconstruction process is determined by
minimizing  the  ratio  of  the  H2  norm  of  the  projected  transfer  matrix  from  the  disturbance 
to  the  H2  norm  of  the  projected  transfer  matrix  from  the  fault.  For  the  existence  of  the 
fault  reconstruction  process,  the  invariant  zeros  of  the  fault  have  to  be  in  the  left  half 
plane.  Furthermore,  for  reconstructing  a  sensor  fault,  the  system  has  to  be  detectable 
with  respect  to  the  other  sensors. 

2.4  A  Decentralized  Fault  Detection  Filter 

The  decentralized  fault  detection  filter  has  a  structure  that  results  from  merging 
decentralized  estimation  theory  with  the  game  theoretic  fault  detection  filter  [9,  Appendix 
I]. A decentralized approach may be the ideal way to monitor the health of large-scale systems,
since it decomposes the problem into (potentially smaller) “local” problems. These
local  results  are  then  blended  into  a  “global”  result  that  describes  the  health  of  the  entire 
system.  The  benefits  of  such  an  approach  include  added  fault  tolerance  and  easy 
scalability.  An  example  given  at  the  end  of  the  paper  demonstrates  the  use  of  this  filter 
for  a  platoon  of  cars  proposed  for  an  advanced  vehicle  control  system. 
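
The blending of local results into a global result can be sketched with the standard information-form fusion rule from decentralized estimation, in which local information (inverse variance) adds. The scalar example below assumes independent local estimates of a common quantity and no shared prior; it illustrates the blending step only, not the game theoretic fault detection filter of [9], and the names and numbers are illustrative.

```python
def fuse(local_estimates):
    """Fuse independent scalar estimates given as (mean, variance) pairs.

    Returns the global (mean, variance) under the assumption that the local
    estimation errors are independent and no prior is shared between nodes.
    """
    info = sum(1.0 / var for _, var in local_estimates)
    info_state = sum(mean / var for mean, var in local_estimates)
    return info_state / info, 1.0 / info


if __name__ == "__main__":
    nodes = [(10.2, 4.0), (9.7, 1.0), (10.5, 2.0)]  # made-up local results
    print("all nodes:      ", fuse(nodes))
    # Dropping a node degrades the global estimate but does not break it.
    print("one node missing:", fuse(nodes[:2]))
```

Because the global result is a sum of local contributions, adding or removing a node only adds or removes one term, which is one way to read the scalability and fault tolerance benefits noted above.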

2.5  Application  of  Detection  Methods  to  Target  Association 

A residual-based scheme for solving the radar track association problem using bearings-only
measurements is developed [10, Appendix J]. To accomplish track association
between two stations, we analyze the residuals of a bank of nonlinear filters called
modified gain extended Kalman filters (MGEKFs). Once tracks have been associated
between two stations, tracks from additional stations may be associated with tracks from
the first two stations by checking algebraic parity equations. Traditional track association
methods rely on the local stations’ estimated target positions and error variances, which
may  be  quite  inaccurate  when  using  bearings-only  measurements.  Our  method  bypasses 
this  difficulty,  since  our  filters  use  raw  data  from  two  stations.  An  example  illustrates  the 
effectiveness  of  our  methods. 

3. Nonlinear Estimation Applied to Relative GPS Navigation

3.1  Exact  Statistical  Solution  of  Pseudorange  Equations 

Although  the  exact  GPS  solution  proposed  by  Bancroft  is  nonlinear,  it  may  be 
manipulated  into  a  linear  form  when  5  or  more  satellites  are  visible  [11,  Appendix  K]. 
This  linear  form  is  exact,  as  opposed  to  the  linear  solution  obtained  via  repeated 
linearization  in  the  iterated  least  squares  (ILS)  method.  By  virtue  of  this  exactness,  the 
solution  of  the  linear  form  is  always  the  true  user  position,  while  the  ILS  may  converge  to 
an  incorrect  solution  (this  is  especially  common  when  the  GPS  user  is  in  space). 

When  the  measured  pseudoranges  are  noisy,  the  linear  structure  ensures  that  the  position 
estimate  will  converge  to  the  correct  value  and  that  the  error  covariance  of  the  estimate  is 
known,  guarantees  that  have  not  been  found  for  nonlinear  estimators  that  use  the  Bancroft 
solution  directly.  The  conversion  to  the  linear  form  excludes  information  present  in  a 
single  scalar  nonlinear  measurement  equation.  We  demonstrate  several  procedures  for 
refining  the  linear  estimate  with  this  remaining  information.  In  addition,  we  show  that 
the  methodology  developed  for  direct  GPS  solutions  can  be  applied  to  create  linear  direct 
methods  for  differential  GPS  problems. 
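
One standard way to see why five satellites suffice for a linear form is sketched below. The notation is assumed here ($s_i$ for the satellite positions, $\rho_i$ for the pseudoranges, $x$ for the user position, $b$ for the clock bias in units of length, and $\Lambda$ for an auxiliary unknown) and need not match the construction used in [11].

```latex
\begin{align*}
\rho_i &= \lVert x - s_i \rVert + b, \qquad i = 1, \dots, n, \\
(\rho_i - b)^2 &= x^{T}x - 2 s_i^{T}x + s_i^{T}s_i, \\
2 s_i^{T}x - 2\rho_i b - \Lambda &= s_i^{T}s_i - \rho_i^{2},
\qquad \Lambda \equiv x^{T}x - b^{2}.
\end{align*}
```

Treating $\Lambda$ as a fifth unknown makes each equation linear in $(x, b, \Lambda)$, so $n \ge 5$ satellites with suitable geometry determine the solution by linear least squares. The relation $\Lambda = x^{T}x - b^{2}$ is then a single scalar nonlinear equation that the linear solve leaves unused, which is consistent with the remark above about excluded information.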

3.2  Multiple  Hypothesis  Sequential  Probability  Ratio  Tests  for  Resolving  Integer 
Ambiguity  in  GPS 

Two statistical techniques appropriate for the "validation" of integer ambiguities and the
detection of cycle slips are developed [12, Appendix L]. The multiple hypothesis Wald
sequential probability ratio test (SPRT) can find the conditional probability that each set
of integer biases under consideration is the true bias condition. The multiple hypothesis
Shiryayev SPRT determines the conditional probability that the integer biases have
jumped from the nominal bias condition to each member of a collection of other bias
conditions. Hence, the Wald SPRT is a method for validating the integer ambiguities
during the initial ambiguity resolution, while the Shiryayev SPRT can be used to monitor
for cycle slips. Each of these multiple hypothesis SPRTs (MHSPRTs) makes use of two
measurement residuals. One is a geometric combination of the carrier phase
measurements, and the other is generated by differencing the carrier phase measurements
with code measurements.

Prior work on cycle slip monitoring has focused solely on detecting the occurrence
of a cycle slip in the shortest time, balanced against the probability of issuing a false alarm.
With such methods, once a disruption has occurred, the ambiguity resolution process must
restart from scratch. The Shiryayev SPRT bypasses this problem, as it announces the location of the
biases after the jump, in addition to the time of the cycle slip. The calculations for the
MHSPRTs are not linked to any particular distribution, unlike prior efforts. Only the
probability density functions of the measurement residuals are required. Hence, the
techniques can correctly compensate for non-Gaussian errors in measurement such as
multipath.  For  each  hypothesis  under  consideration,  the  MHSPRTs  yield  the  probability 
of  that  hypothesis  being  the  correct  one.  The  "state"  of  the  MHSPRT  recursions  is  the 
vector  of  all  of  these  probabilities.  Information  from  past  measurements  is  embedded  in 
this  state.  This  recursive,  probabilistic  framework  makes  it  very  straightforward  to  add 
new  hypotheses  into  the  set  of  possible  bias  conditions  while  retaining  information  from 
prior  measurements.  In  contrast,  there  is  no  way  to  do  this  for  other  techniques,  since 
they  are  based  on  cumulative  sums.  Results  from  successful  simulations  and  field 
experiments  show  the  efficacy  of  these  techniques. 
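
The recursion on hypothesis probabilities can be sketched as a Bayes update in which each hypothesis is scored by the density of the current residual. The code below is a minimal illustration of that idea under assumed Gaussian residual densities with hypothesis-dependent means; the function names, the candidate biases, the residual data, and the rule used to add a hypothesis are all hypothetical, and the actual test statistics and thresholds of [12] are not reproduced.

```python
import math


def gaussian_pdf(r, mean, sigma):
    """Density of a scalar Gaussian residual; any other density could be used."""
    z = (r - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def update(probs, residual, densities):
    """One recursion step: probs[i] <- probs[i] * f_i(residual), renormalized."""
    weighted = [p * f(residual) for p, f in zip(probs, densities)]
    total = sum(weighted)
    return [w / total for w in weighted]


def add_hypothesis(probs, prior):
    """Append a hypothesis with probability `prior`, rescaling the others.

    The relative weights of the existing hypotheses, and hence the
    information from past measurements, are preserved.
    """
    return [p * (1.0 - prior) for p in probs] + [prior]


if __name__ == "__main__":
    biases = [0.0, 1.0, -1.0]  # candidate integer biases, in cycles
    densities = [lambda r, m=m: gaussian_pdf(r, m, 0.5) for m in biases]
    probs = [1.0 / len(biases)] * len(biases)
    residuals = [0.9, 1.2, 0.8, 1.1, 1.0, 0.7, 1.3]  # made-up data near 1.0
    for r in residuals:
        probs = update(probs, r, densities)
    print("posterior probabilities:", probs)
```

The state carried between steps is just the probability vector, and the residual densities enter only through function calls, so a non-Gaussian model (for example, one representing multipath) can be substituted without changing the recursion.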

Publications 

[1] Kambiz Shoarinejad, Jason L. Speyer, and Ioannis Kanellakopoulos, “A Stochastic
Decentralized Control Problem with Noisy Communication,” SIAM Journal on Control
and Optimization, Vol. 41, No. 3, 2002, pp. 975-990.

[2] Kambiz Shoarinejad, Jason L. Speyer, and Gregory J. Pottie, “A distributed scheme
for integrated predictive dynamic channel and power allocation in cellular radio
networks,” Proceedings of the IEEE Globecom Conference 2001 and to be published in the
IEEE Transactions on Wireless Communications (IEEE ToWC).

[3] Kambiz Shoarinejad, Jason L. Speyer, Fernando Paganini, and Gregory J. Pottie,
“Global Stability of Feedback Power Control Algorithms for Cellular Radio Networks,”
Proceedings of the IEEE CDC’01.

[4] Jonathan D. Wolfe and Jason L. Speyer, “The periodic optimality of LQ controllers
satisfying strong stabilization,” Proceedings of the IFAC workshop on periodic control,
August 2001 and to be published in Automatica.

[5] Robert H. Chen and Jason L. Speyer, “A generalized least-squares fault detection
filter,” International Journal of Adaptive Control and Signal Processing, vol. 14, pp.
747-757, 2000.

[6] Robert H. Chen and Jason L. Speyer, “Robust Multiple-Fault Detection Filter,”
special issue on condition monitoring, fault detection and isolation, International
Journal of Robust and Nonlinear Control, Vol. 12, Issue 8, 2002.

[7] Robert H. Chen, D. Lewis Mingori and Jason L. Speyer, “Optimal Stochastic Fault
Detection Filter,” Automatica, vol. 39, 2003, pp. 377-390.

[8] Robert H. Chen and Jason L. Speyer, “Fault Reconstruction from Sensor and Actuator
Failures,” Proceedings of the IEEE Conference on Decision and Control, December
2001.

[9] Walter H. Chung, Jason L. Speyer and Robert H. Chen, “A Decentralized Fault
Detection Filter,” ASME J. of Dynamic Systems, Measurement, Control, Vol. 123, 2001.

[10]  Jonathan  D.  Wolfe  and  Jason  L.  Speyer,  “Target  Association  Using  Detection 
Methods,”  AIAA  J.  Guidance,  Control,  and  Dynamics,  Vol.  25,  No.  6,  November- 
December,  2002. 

[11] Jonathan D. Wolfe and Jason L. Speyer, “Exact Statistical Solution of Pseudorange
Equations,” Proceedings of the ION GPS 2001 and to be published in The Journal of the
Institute of Navigation.

[12] Jonathan D. Wolfe, Jason L. Speyer, and Walton R. Williamson, “Multiple
Hypothesis Sequential Probability Ratio Tests for Resolving Integer Ambiguity in GPS,”
Proceedings of the ION GPS 2001 and to be published in The Journal of the Institute of
Navigation.


Personnel  Supported 

Graduate Students: Robert Chen, Jonathan Wolfe, Kambiz Shoarinejad, Charles
Dillon, and Frederico Najson.

Interactions 

Air Vehicle Division, WPAFB. Contact: Dr. Siva Banda.
Munitions Division, Eglin AFB. Contact: Johny Evers and Rob Murphy.

Transitions 

NASA Dryden Flight Research Center, integer ambiguity resolution. Contact: Gerard
Schkolnik (661-258-3681)

Advisory  Function 

Member of the Air Force Scientific Advisory Board

Honors/Awards

Fellow of the IEEE and AIAA. Recipient of the IEEE Third Millennium Medal.
Department of the Air Force Award for Meritorious Civilian Service, 2001.
NASA Public Service Group Achievement Award, awarded to the UCLA Autonomous
Vehicles System Instrumentation Laboratory.




Appendix  A 


“A Stochastic Decentralized Control Problem with Noisy Communication,”


Kambiz Shoarinejad, Jason L. Speyer, and Ioannis Kanellakopoulos,
SIAM Journal on Control and Optimization, Vol. 41, No. 3, 2002, pp. 975-990.


SIAM J. Control Optim.
Vol. 41, No. 3, pp. 975-990


(c)  2002  Society  for  Industrial  and  Applied  Mathematics 


A  STOCHASTIC  DECENTRALIZED  CONTROL  PROBLEM  WITH 
NOISY  COMMUNICATION* 

KAMBIZ SHOARINEJAD, JASON L. SPEYER, AND IOANNIS KANELLAKOPOULOS

Abstract. A simple decentralized stochastic control problem is considered where the nonclassical
nature of the information pattern is induced by the uncertainty on the information transmission in
the system. This is, in fact, a reformulation of the Witsenhausen counterexample, where the first
station is allowed to send its information to the second station through a noisy channel. Nonconvexity
of the problem in this new formulation has been established, and it is shown how this formulation
relates to a classical problem and the Witsenhausen problem, respectively, when the transmission
noise intensity goes to zero or infinity. Assuming small transmission noise intensity, we then use an
asymptotic approach in order to find an approximated cost. A necessary condition for asymptotically
optimal strategies has been obtained using a variational approach, and it is shown that the linear
strategies, with slightly different coefficients than the noiseless transmission case, satisfy the necessary
condition.

Key words. optimal stochastic control, decentralized systems, asymptotic analysis
AMS subject classifications. 93E20, 93A14
PII. S0363012901385629

1. Introduction. Coordinating and controlling dynamic systems in spatial networks
has always been a challenging problem for system designers. It is now attracting
more attention as various new applications are emerging in a very wide range, from
autonomous vehicles in formation to flow and congestion control in computer networks.
However, there are still some major difficulties in dealing with such systems. The
main characteristic of any decentralized system is that the information is distributed
among different stations and the performance of the system depends highly on the
corresponding information pattern, i.e., who knows what and when. The stations may
communicate with each other, possibly by signaling through noisy channels. Even
though there might be some physical constraints on the information structure of the
system (e.g., locations of the sensors, the actuators, and the transmitters), in general,
an optimal information pattern should be obtained. Then, based on the locally available
information, a set of coordinated local strategies should be designed in order to
achieve a common objective. In many cases, however, we will end up with nonconvex
functional optimization problems, which are usually very difficult to solve.

One  such  class  of  problems  is  when  a  decentralized  system  has  a  nonclassical 
information  pattern  which  is  not  partially  nested.  The  information  pattern  is  called 
nonclassical  when  the  distributed  stations  do  not  have  access  to  the  same  information 
and/or some stations do not have perfect recall (i.e., they lose information). Moreover,
a nonclassical information pattern is not partially nested when some stations cannot
reconstruct  the  previous  actions  of  other  stations  which  have  affected  their  own  local 
information.  Unfortunately,  this  happens  in  many  decentralized  systems. 

In  1968,  Witsenhausen  provided  a  simple  example  in  [1]  in  which  there  are  only 
two  stations,  the  dynamics  are  linear,  the  underlying  uncertainties  are  additive  and 
Gaussian,  and  the  cost  is  quadratic.  The  information  pattern,  however,  is  nonclassical. 
This  example  motivated  much  research  on  the  links  between  decentralized  stochastic 
control  problems  and  team  theory  and  the  effects  of  different  information  patterns  on 
decentralized  systems.  Although  it  is  a  very  simple  example,  it  demonstrates  the  main 
difficulties  induced  by  nonclassical  information  patterns.  In  this  example,  one  station 
acts  first  and  affects  the  information  available  to  the  next  station,  while  there  is  no 
way  for  the  second  station  to  determine  the  action  of  the  first  station.  The  existence 
of  the  optimal  design  was  established  in  [1],  where  a  nonlinear  set  of  strategies  was 
also  proposed  which  showed  that  no  affine  strategy  could  be  optimal. 

This seemingly simple example, which is also called Witsenhausen’s counterexample,
turned out to be extremely hard. It is still outstanding after more than 30 years.
It was later shown in [2] that when the uncertainty on the information available to
the first station is small, linear strategies would still be optimal over a large class of
nonlinear strategies. Intuitively, when the uncertainty on the information of the first
station is small, the second station will also be able to guess what that information
was. Therefore, since the problem is cooperative in the sense that the stations are
aware of each other’s strategies, the second station can almost reconstruct the action
of the first station, and there is no need for any kind of signaling among the stations
through the dynamics of the system. In Witsenhausen’s problem, the nonclassical
nature of the information pattern is a result of the fact that the information available
to the first station is completely inaccessible to the second station. However, recent
advances in computing and communication technologies make it possible for the stations
in many decentralized systems to communicate different pieces of information.
But communications can never be perfect, and there is always some uncertainty involved.
Unfortunately, such uncertainty will again induce a nonclassical nature on
the information pattern of the system.

In  this  paper,  we  reformulate  Witsenhausen’s  problem  by  allowing  the  first  station 
to  communicate  its  information  with  the  second  station  through  a  noisy  channel.  Then 
we  show  that  as  long  as  there  is  noise  in  the  transmission,  the  main  difficulties  will 
persist.  Specifically,  the  cost  might  still  be  nonconvex  with  respect  to  the  strategies. 
We  then  consider  the  two  limit  cases  where  the  transmission  uncertainty  becomes 
either  very  large  or  negligible.  We  show  how  this  new  formulation  covers  a  wide 
range  of  problems,  from  the  classical  linear  quadratic  Gaussian  (LQG)  problem  to 
the  Witsenhausen  counterexample. 

When  the  transmission  noise  intensity  is  small,  one  would  expect  the  optimal 
strategies  to  be  very  close  to  the  corresponding  strategies  for  the  noiseless  transmission 
case.  Our  next  objective  in  this  paper  is  to  investigate  this  case  through  an  asymptotic 
analysis. 

In section 2, we present the problem formulation. In section 3, we obtain an alternative
form for the performance index, which clearly shows the possible nonconvexity
of the cost with respect to the strategies. In section 4, we consider the two limit
cases, i.e., when the transmission noise intensity goes to zero or infinity. In section
5, we assume a small uncertainty on the transmission and approximate the cost by
expanding it in terms of the small transmission noise intensity. In section 6, we use
a variational approach in order to find a necessary condition for the strategies that
minimize the approximated cost. As we shall see, we will actually have a singular
optimization problem. We then show that the asymptotically optimal strategies can
still be linear with slightly different coefficients than the corresponding strategies for
the noiseless transmission case. We provide concluding remarks in the final section.

2. Problem description. Consider a two-stage stochastic problem with the
following state equations:

(2.1)  $x_1 = x_0 + u_1$,

(2.2)  $x_2 = x_1 - u_2$,

where $x_0$ is the initial state, which is assumed to be a zero mean Gaussian random
variable with variance $\sigma_0^2$. The information pattern of the system is specified by the
following output equations:


where  V2  is  the  measurement  noise  for  the  second  station,  which  is  also  assumed  to  be 
a  zero  mean  Gaussian  random  variable  with  unit  variance.  As  we  can  see,  the  infor¬ 
mation  available  to  the  first  station  is  being  transmitted  to  the  second  station,  and  the 
communication  uncertainty  is  modeled  by  an  additive  Gaussian  noise  vt  ~  M  (0,  e2) . 
Also,  Xq,  V2j  and  vt  are  all  assumed  to  be  independent  of  each  other.  It  is  clear  that 
we  have  simply  modeled  the  received  information  signal  as  the  transmitted  signal 
plus the Gaussian transmission noise. While this model can be quite realistic for
analog communication systems, it may not be well justified when digital communication
is used. In digital communication systems the signal is quantized, coded, and sent
through the channel. Still, the channel noise may realistically be assumed to be
additive and Gaussian, but sophisticated modulation and coding schemes make it difficult
to assume a simple additive Gaussian uncertainty for the received information signal.
However, if we try to incorporate the quantization effects along with the bit error
probability distribution for some good coding and modulation schemes in order to
model the communication uncertainties, we will end up with models which could still
be approximated, to some degree, by simple additive Gaussian models. Moreover,
since there are already major difficulties in dealing with decentralized nonclassical
information patterns, using more complex models for communication uncertainties
may not seem very reasonable at this point. Furthermore, we believe that the results
obtained under such a simplifying assumption would still serve as a guideline for
finding the true nature of the optimal decentralized strategies. The objective is now to
design the control strategies $\gamma_1$ and $\gamma_2$,


(2.5)  $u_1 = \gamma_1(z_1)$,

(2.6)  $u_2 = \gamma_2(z_2)$,


in order to minimize the cost function


(2.7)  $J = E\left[k^2 u_1^2 + x_2^2\right]$,

where $k^2 > 0$ is a given constant. Note that this is a sequential stochastic control
problem in the sense that the second station acts after the first station. In other words,
-S-SlAu  -ru  -GtiDlClV-'CiDJ-'DlClV-'CJ

+  lAlt-rll-G1(DlClV-1C2D2)-lDlClV~1C1fS 

+  Sl-NlQlNj  +  Gl(DlCr2V~iC2D2rlGllS-CTHTV~1HCl  (23) 

By  substituting  (21)  and  (22)  into  (20a),  the  reduced-order  limiting  generalized  least-squares  fault 
detection  filter  is 

??,  =(All-rn)rjl+Mlu  +  lGl(Dr2Cr2V-lC2D2rtDlCT2V~1 +  S-lCr1HTV~lH'](y-Clnl) 

(24) 

Note  that  ru  can  be  computed  a  priori.  In  the  limit,  the  residual  (3)  becomes 

r^Hiy-CrfJ  (25) 

because  fiC2  =  0  from  (21)  and  Ker  H  =  Ker  H. 


7. EXAMPLE

In this section, two numerical examples are used to demonstrate the performance of the
generalized least-squares fault detection filter. In Section 7.1, the filter is applied to a
time-invariant system. In Section 7.2, the filter is applied to a time-varying system.

7.1. Example 1

In this section, two cases for a time-invariant problem are presented. The first one shows that the
sensitivity of the filter (8) to the nuisance fault decreases when $\gamma$ is smaller. The second one shows
that the sensitivity of the reduced-order limiting filter (24) to the target fault increases when $Q_1$ is
larger. The system matrices are


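$$A = \begin{bmatrix} 0 & 3 & 4 \\ 1 & 2 & 3 \\ 0 & 2 & 5 \end{bmatrix}, \quad C = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad F_1 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad F_2 = \begin{bmatrix} 5 \\ 1 \\ 1 \end{bmatrix}.$$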

In the first case, the steady-state solutions to the Riccati equation (9) are obtained with
weightings chosen as $Q_1 = 1$, $Q_2 = 1$, and $V = I$ when $\gamma = 10^{-4}$ and $10^{-6}$, respectively. The top
two figures of Figure 1 show the frequency response from both faults to the residual (3). The left
one is $\gamma = 10^{-4}$ and the right one is $\gamma = 10^{-6}$. The solid lines represent the target fault, and the
dashed lines represent the nuisance fault. This example shows that the nuisance fault transmission
can be reduced by using a smaller $\gamma$ while the target fault transmission is not affected.

In the second case, the steady-state solutions to the reduced-order limiting Riccati equation
(23) are obtained with $V = 10^{-4} I$ when $Q_1 = 0$ and $0.0019$, respectively. The lower two figures of
Figure 1 show the frequency response from the target fault and sensor noise to the residual (25).
The left one is $Q_1 = 0$, and the right one is $Q_1 = 0.0019$. The solid lines represent
the target fault, and the dashed lines represent the sensor noise. This example shows that the
sensitivity of the filter to the target fault can be enhanced by using a larger $Q_1$. The sensor noise
transmission also increases because part of the sensor noise comes through the same direction as
the target fault. However, the sensor noise transmission is small compared to the target fault
transmission. In this case, the nuisance fault transmission stays zero and is not shown in these
figures. Note that when $Q_1 = 0$, the generalized least-squares fault detection filter is similar to
Reference [2] which does not enhance the target fault transmission.

Figure 1. Frequency response of the residual.

7.2. Example 2

In this section, the filter (8) and the reduced-order limiting filter (24) are applied to a time-varying
system which is obtained by modifying the time-invariant system in the previous section, adding some
time-varying elements to the $A$ and $F_2$ matrices while the $C$ and $F_1$ matrices are the same:


$$A = \begin{bmatrix} \cos t & 3 + 2\sin t & 4 \\ 1 & 2 & 3 - 2\cos t \\ 5\sin t & 2 & 5 + 3\cos t \end{bmatrix}, \quad F_2 = \begin{bmatrix} 5 - 2\cos t \\ 1 \\ 1 + \sin t \end{bmatrix}.$$

The Riccati equation (9) is solved with $Q_1 = 1$, $Q_2 = 1$, $V = I$ and $\gamma = 10^{-5}$ for $t \in [0, 25]$. The
reduced-order limiting Riccati equation (23) is solved with the same $Q_1$ and $V$. Figure 2 shows the
time response of the residual when there is no fault, a target fault and a nuisance
fault, respectively. The faults are unit steps that occur at the fifth second. In each case, there is no
sensor noise. The left three figures show the residual (3) for the filter (8). There is a small nuisance
fault transmission because (8) is an approximate unknown input observer. The right three figures
show the residual (25) for the reduced-order limiting filter (24). Note that the nuisance fault
transmission is zero. This example shows that both filters, (8) and (24), work well for time-varying
systems.

Figure 2. Time response of the residual (left column: not in the limit; right column: in the limit).
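The system data of the two examples can be written down compactly as follows. This is only an illustrative sketch: the numerical entries are those of the example as best read from the text above, the function names are hypothetical, and the final check (linear independence of $C F_1$ and $C F_2(t)$, i.e., that the two fault directions remain distinguishable in the output) is an added sanity test, not a step taken from the filter derivation.

```python
import math

# Time-invariant data of Example 1 (entries as read from the example; assumed).
A0 = [[0.0, 3.0, 4.0], [1.0, 2.0, 3.0], [0.0, 2.0, 5.0]]
C = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
F1 = [0.0, 0.0, 1.0]          # first fault direction
F2_0 = [5.0, 1.0, 1.0]        # second fault direction (time-invariant case)


def A_of_t(t):
    """Time-varying A(t) of Example 2."""
    return [
        [math.cos(t), 3.0 + 2.0 * math.sin(t), 4.0],
        [1.0, 2.0, 3.0 - 2.0 * math.cos(t)],
        [5.0 * math.sin(t), 2.0, 5.0 + 3.0 * math.cos(t)],
    ]


def F2_of_t(t):
    """Time-varying F2(t) of Example 2."""
    return [5.0 - 2.0 * math.cos(t), 1.0, 1.0 + math.sin(t)]


def matvec(M, v):
    """Plain matrix-vector product for lists of lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def output_directions_det(t):
    """Determinant of [C F1, C F2(t)]; nonzero means independent output directions."""
    a = matvec(C, F1)
    b = matvec(C, F2_of_t(t))
    return a[0] * b[1] - a[1] * b[0]
```

With these entries the determinant equals -1 for every t, so the two output directions never align on the interval used in the example.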


8.  CONCLUSION 

The  generalized  least-squares  fault  detection  filter  is  derived  from  solving  a  min-max  problem 
which  makes  the  residual  sensitive  to  the  target  fault,  but  insensitive  to  the  nuisance  faults.  In  the 
limit  where  the  weighting  on  the  nuisance  faults  is  zero,  the  filter  becomes  equivalent  to  the 
unknown input observer which places the nuisance faults into a minimal $(C, A)$-unobservability
subspace  and  there  exists  a  reduced-order  filter.  Since  the  target  fault  is  explicit  in  the  problem 
formulation,  the  sensitivity  of  the  filter  to  the  target  fault  can  be  enhanced.  Filter  designs  can  be 
obtained for both linear time-invariant and time-varying systems.

the order in which the stations apply their control actions does not depend on the
uncertainties  in  the  system.  We  see  that  the  first  controller  has  perfect  information 
but  its  action  is  costly.  In  contrast,  the  second  controller  has  inexpensive  control  but 
noisy  information.  Since  the  second  station  does  not  know  what  the  first  station  knew, 
due  to  the  transmission  noise,  we  do  not  have  perfect  recall,  and  hence  we  still  have 
a  nonclassical  pattern.  If  there  was  no  transmission  noise,  we  would  have  a  classical 
information  pattern  for  which  the  unique  optimal  strategies  are  known  to  be  linear  in 
the  information. 
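The two-station problem described above can be simulated directly. The following is a minimal Monte Carlo sketch, assuming the state equations (2.1)-(2.2), that the first station observes $x_0$ exactly, and that the second station receives $z_{21} = z_1 + v_t$ and $z_{22} = x_1 + v_2$; the function name `simulate_cost` and its parameters are hypothetical and chosen only for illustration.

```python
import random


def simulate_cost(gamma1, gamma2, k2, sigma0, eps, n=20000, seed=0):
    """Monte Carlo estimate of J = E[k^2 u1^2 + x2^2] for given strategies."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x0 = rng.gauss(0.0, sigma0)        # initial state
        z1 = x0                            # station 1 has perfect information
        u1 = gamma1(z1)
        x1 = x0 + u1                       # first-stage state
        z21 = z1 + rng.gauss(0.0, eps)     # z1 sent through the noisy channel
        z22 = x1 + rng.gauss(0.0, 1.0)     # noisy measurement of x1
        u2 = gamma2(z21, z22)
        x2 = x1 - u2                       # second-stage state
        total += k2 * u1 * u1 + x2 * x2
    return total / n


# Noiseless channel with the classical linear strategies: zero cost.
noiseless = simulate_cost(lambda z1: 0.0, lambda z21, z22: z21,
                          k2=0.04, sigma0=5.0, eps=0.0)
# Same strategies with a noisy channel: the cost becomes strictly positive.
noisy = simulate_cost(lambda z1: 0.0, lambda z21, z22: z21,
                      k2=0.04, sigma0=5.0, eps=0.5)
```

Any pair of candidate strategies, linear or not, can be passed in as `gamma1` and `gamma2`, which makes the sketch convenient for comparing designs at a fixed noise level.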

3. An alternative form for the performance index. In this section, we
show how the performance index may be expressed in terms of the Fisher information
matrix, which indicates that the cost may not be convex in the strategies.

For simplicity, and similarly to the Witsenhausen problem, we define

(3.1)  $f(z_1) := z_1 + \gamma_1(z_1) = x_0 + u_1$,

(3.2)  $g(z_2) := \gamma_2(z_2) = u_2$.

Then the cost can be expressed as

       $J = E\left[k^2 u_1^2 + x_2^2\right]$

       $\;\;\, = E\left[k^2 (z_1 - f(z_1))^2 + (f(z_1) - g(z_2))^2\right]$

(3.3)  $\;\;\, := J(f, g)$.

If we fix the function $f$, the optimal strategy $g$ will clearly be obtained as the
conditional expectation, i.e.,

(3.4)  $g^*(z_2) = \arg\min_g J(f, g) = E[f(z_1) \mid z_2]$.

Substituting the above equation back in the cost, we get

       $J^*(f) := J(f, g^*)$

       $\;\;\, = k^2 E\left[(z_1 - f(z_1))^2\right] + E\left[(f(z_1) - g^*(z_2))^2\right]$

(3.5)  $\;\;\, = k^2 E\left[(z_1 - f(z_1))^2\right] + E\left[(f(z_1))^2\right] - E\left[(g^*(z_2))^2\right]$,

where we have used the orthogonality property of the conditional expectation

(3.6)  $E\left[(f(z_1) - g^*(z_2))\, g^*(z_2)\right] = 0$.

It is important to note the minus sign in the third term in (3.5). As we shall see,
this minus sign could indeed destroy the convexity of the cost with respect to the
strategies.

The objective is now to express the cost $J^*(f)$ in terms of only one strategy $f$.
In doing so, we use the following lemma, which shows how $g^*(z_2)$ may be expressed
in terms of information $z_2$ and its probability density function.

Lemma 3.1. The optimal strategy $g^*(z_2)$ can be expressed as

(3.7)  $g^*(z_2) = z_{22} + \frac{\partial}{\partial z_{22}} \ln p(z_2)$,

where $p(z_2) = p(z_{21}, z_{22})$ is the probability density function for the information
available to the second station.

Proof. We have

       $g^*(z_2) = \int f(z_1)\, p(z_1 \mid z_2)\, dz_1$

(3.8)  $\;\;\, = \frac{\int f(z_1)\, p(z_1, z_2)\, dz_1}{\int p(z_1, z_2)\, dz_1}$,

where $p(z_1, z_2)$ is the joint probability density of $z_1$ and $z_2$. At the same time, one
can write

(3.9)  $f(z_1)\, p(z_1, z_2) = z_{22}\, p(z_1, z_2) + \frac{\partial}{\partial z_{22}} p(z_1, z_2)$.

This can be shown as

       $z_{22}\, p(z_1, z_2) + \frac{\partial}{\partial z_{22}} p(z_1, z_2) = z_{22}\, p(z_1, z_2) + \frac{\partial}{\partial z_{22}} \left[ p(z_2 \mid z_1)\, p(z_1) \right]$

       $\;\;\, = z_{22}\, p(z_1, z_2) + p(z_1) \frac{\partial}{\partial z_{22}}\, p_{v_t, v_2}\!\left(z_{21} - z_1,\; z_{22} - f(z_1)\right)$

       $\;\;\, = z_{22}\, p(z_1, z_2) + p(z_1) \frac{\partial}{\partial z_{22}} \left[ \frac{1}{2\pi\epsilon} \exp\left( -\frac{(z_{21} - z_1)^2}{2\epsilon^2} - \frac{(z_{22} - f(z_1))^2}{2} \right) \right]$

(3.10) $\;\;\, = f(z_1)\, p(z_1, z_2)$,

where we have used the specific form of the information available to the second station
and the fact that $v_t \sim \mathcal{N}(0, \epsilon^2)$ and $v_2 \sim \mathcal{N}(0, 1)$ are independent. By substituting
for $f(z_1)\, p(z_1, z_2)$ from (3.9) back in (3.8) and integrating with respect to $z_1$, the
expression in (3.7) is obtained. □
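Lemma 3.1 can be checked in closed form when the first strategy is linear, since everything is then jointly Gaussian. The sketch below assumes $f(z_1) = c z_1$ (the coefficient `c` and the function name are hypothetical), computes the conditional mean $E[f(z_1) \mid z_2]$ from the usual Gaussian regression formula, and compares it with $z_{22} + \partial \ln p(z_2) / \partial z_{22}$, using the fact that for a zero mean Gaussian vector the gradient of the log density is $-\Sigma^{-1} z_2$.

```python
def conditional_mean_two_ways(c, sigma0, eps, z21, z22):
    """Return (E[c*z1 | z2], z22 + d/dz22 ln p(z2)) for the linear-Gaussian case."""
    s = sigma0 ** 2
    # Covariance of z2 = (z21, z22) with z21 = z1 + v_t and z22 = c*z1 + v_2.
    a, b, d = s + eps ** 2, c * s, c * c * s + 1.0
    det = a * d - b * b
    inv11, inv12, inv22 = d / det, -b / det, a / det
    # Direct Gaussian regression: Cov(c*z1, z2) * Sigma^{-1} * z2.
    cov1, cov2 = c * s, c * c * s
    direct = ((cov1 * inv11 + cov2 * inv12) * z21
              + (cov1 * inv12 + cov2 * inv22) * z22)
    # Score form of the lemma: d/dz22 ln p(z2) = -(Sigma^{-1} z2)_2.
    score = -(inv12 * z21 + inv22 * z22)
    return direct, z22 + score


direct, via_score = conditional_mean_two_ways(0.8, 2.0, 0.3, 1.1, -0.7)
```

The two returned numbers agree to rounding error for any choice of the arguments, which is the content of (3.7) in this special case.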

As we shall see, when we try to express the performance index in terms of only a
single strategy $f$, a Fisher information term comes up in the cost. Fisher information
is originally obtained in the Cramér-Rao bound, which is a measure for the minimum
error in estimating a parameter based on the value of a random variable. However, by
introducing a location parameter, an alternative form of the Fisher information may
be defined for a random variable with a given distribution. This alternative form is,
in fact, related to the entropy measure (see [3, p. 494]). We first present the definition
for the Fisher information matrix.

Definition 3.2. The Fisher information matrix for a random vector $Z$ is
defined as

(3.11)  $I_f(Z) := E\left[ \nabla_z \ln p(z) \cdot \nabla_z^T \ln p(z) \right]$,

where $p(z)$ is the probability density function for the random vector $Z$ and $\nabla_z$
denotes the gradient vector with respect to $z$:

(3.12)  $\nabla_z = \left[ \frac{\partial}{\partial z_1} \;\cdots\; \frac{\partial}{\partial z_n} \right]^T$,

where $z_i$ is the $i$th component in the random vector.
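For a scalar random variable, Definition 3.2 reduces to $E[(\partial \ln p(z) / \partial z)^2]$, which is easy to evaluate numerically. The sketch below is illustrative only: the function name, the trapezoid rule, and the finite-difference step `h` are assumptions, and the density is assumed to be strictly positive on the slightly enlarged interval $[lo - h, hi + h]$.

```python
import math


def fisher_information_1d(pdf, lo, hi, n=20001, h=1e-4):
    """Approximate E[(d/dz ln p(z))^2] with a trapezoid rule on [lo, hi]."""
    step = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        z = lo + i * step
        # Central difference of ln p gives the score at z.
        score = (math.log(pdf(z + h)) - math.log(pdf(z - h))) / (2.0 * h)
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * score * score * pdf(z) * step
    return total


s = 2.0
gaussian = lambda z: math.exp(-0.5 * (z / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
info = fisher_information_1d(gaussian, -20.0, 20.0)   # close to 1 / s**2
```

For a Gaussian density with standard deviation `s` the result is close to `1 / s**2`, the familiar value of the location Fisher information.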

We are now ready to present the alternative expression for the performance index.

Theorem 3.3. The performance index (3.5) can be written as

(3.13)  $J^*(f) = k^2 E\left[(z_1 - f(z_1))^2\right] + 1 - I_f(Z_2)_{22}$,

where $I_f(Z_2)_{22}$ is the $(2, 2)$ element of the Fisher information matrix for the
random vector $Z_2$. The subscript $f$ indicates the fact that it actually depends on the
form of the strategy $f$, which is present in the definition of $z_2$ and would affect its
probability density function.

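Proof. Using (3.7), we first obtain $E[(g^*(z_2))^2]$. We have

(3.14)  $E\left[z_{22}^2\right] = E\left[(f(z_1))^2\right] + 1$,

and

(3.15)  $E\left[ z_{22} \frac{\partial}{\partial z_{22}} \ln p(z_2) \right] = \int\!\!\int_{-\infty}^{+\infty} z_{22} \frac{\partial}{\partial z_{22}} \ln\left(p(z_{21}, z_{22})\right) p(z_{21}, z_{22})\, dz_{21}\, dz_{22}$.

If we integrate by parts with respect to $z_{22}$, we get

        $\int_{-\infty}^{+\infty} z_{22} \frac{\partial}{\partial z_{22}} \ln\left(p(z_{21}, z_{22})\right) p(z_{21}, z_{22})\, dz_{22} = z_{22}\, p(z_{21}, z_{22}) \Big|_{-\infty}^{+\infty} - \int_{-\infty}^{+\infty} p(z_{21}, z_{22})\, dz_{22}$

(3.16)  $\;\;\, = -p(z_{21})$,

where $z_{22}$ is assumed to have a finite mean value, and therefore the first term becomes
zero. Hence,

(3.17)  $E\left[ z_{22} \frac{\partial}{\partial z_{22}} \ln p(z_2) \right] = -1$.

Therefore,

(3.18)  $E\left[(g^*(z_2))^2\right] = -1 + E\left[(f(z_1))^2\right] + I_f(Z_2)_{22}$,

where

(3.19)  $I_f(Z_2)_{22} = E\left[ \left( \frac{\partial}{\partial z_{22}} \ln p(z_2) \right)^2 \right]$.

Substituting (3.18) back in (3.5), we get (3.13) as an alternative form for representing
the performance index. □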

As we see, the cost is now expressed in terms of only one strategy, $f$. This
expression also shows that, in order to minimize the cost, we need the lowest possible
cost associated with the first station while transferring as much information as possible
to the second station through the dynamics of the system. The possible nonconvexity
of the cost with respect to $f$ can also be seen from this alternative expression. It
can be shown that the Fisher information term is a convex functional [4]. Therefore,
$1 - I_f(Z_2)_{22}$ is concave, and the sum of a convex and a concave functional may not
be convex.
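Theorem 3.3 can be verified in closed form for a linear first-stage strategy. The following sketch assumes $f(z_1) = c z_1$ (the coefficient `c` and the function name are hypothetical), uses the fact that the Fisher information matrix of a zero mean Gaussian vector is the inverse of its covariance, and compares (3.13) with a direct evaluation of the cost through the posterior variance of $f(z_1)$ given $z_2$.

```python
def cost_two_ways(c, k2, sigma0, eps):
    """Return J*(f) for f(z1) = c*z1 via (3.13) and via the posterior variance."""
    s = sigma0 ** 2
    # Covariance entries of z2 = (z1 + v_t, c*z1 + v_2).
    a, b, d = s + eps ** 2, c * s, c * c * s + 1.0
    det = a * d - b * b
    fisher_22 = a / det                     # (2,2) element of the inverse covariance
    first_stage = k2 * (1.0 - c) ** 2 * s   # k^2 E[(z1 - f(z1))^2]
    via_fisher = first_stage + 1.0 - fisher_22
    # Direct form: E[(f - g*)^2] = c^2 * Var(z1 | z2) for Gaussian variables.
    posterior_var = 1.0 / (1.0 / s + 1.0 / eps ** 2 + c * c)
    direct = first_stage + c * c * posterior_var
    return via_fisher, direct


via_fisher, direct = cost_two_ways(0.9, 0.04, 5.0, 0.2)
```

Both expressions return the same number, so in the linear-Gaussian case the Fisher information term is just another way of writing the estimation error of the second station.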

4. Limit cases. In this section we consider the two limit cases. First we consider
the case where the transmission is noiseless, and then we investigate the case where
the transmission noise intensity goes to infinity.

4.1. Noiseless transmission. Assume there is no uncertainty in transmitting
information from the first to the second station, i.e., $\epsilon = 0$ and hence $z_{21} = z_1$. In this
case, we have perfect recall and the information pattern is classical. We can write

       $p(z_2) = p(z_{21}, z_{22}) = p(z_{22} \mid z_{21})\, p(z_{21})$

(4.1)  $\;\;\, = p(z_{22} \mid z_1)\, p(z_1) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(z_{22} - f(z_1))^2}{2} \right) p(z_1)$.

Then, from (3.7), we have

(4.2)  $g^*(z_2) = f(z_1) = f(z_{21})$,

which could directly be obtained from the original definition for $g^*$, i.e.,

(4.3)  $g^*(z_2) = E[f(z_1) \mid z_2] = f(z_1)$,

because $z_1$ is exactly known when $z_2$ is given. Substituting this back in (3.5) and
minimizing with respect to the strategy $f$, we have

(4.4)  $g^*(z_2) = f(z_1) = z_1$,

and hence

(4.5)  $\gamma_1(z_1) = 0$,

(4.6)  $\gamma_2(z_2) = z_1$,

which is the unique linear set of optimal strategies. This indeed turns out to be a
very simple example of the well-known classical LQG problem.

4.2. Infinite transmission noise intensity. Another limit case is when the
transmission noise intensity increases to infinity. In this case, $z_{21}$ and $z_{22}$ become
independent and we have

(4.7)  $p(z_2) = p(z_{21}, z_{22}) = p(z_{21})\, p(z_{22})$.

The Fisher information term can now be written as

       $I_f(Z_2)_{22} = \int\!\!\int_{-\infty}^{+\infty} \left( \frac{\partial}{\partial z_{22}} \ln p(z_{21}, z_{22}) \right)^2 p(z_{21}, z_{22})\, dz_{21}\, dz_{22}$

       $\;\;\, = \int_{-\infty}^{+\infty} \left( \frac{\partial}{\partial z_{22}} \ln p(z_{22}) \right)^2 p(z_{22})\, dz_{22}$

(4.8)  $\;\;\, = I_f(Z_{22})$,

which is indeed the Fisher information content of $z_{22}$ only. Hence,

(4.9)  $J^*(f) = k^2 E\left[(z_1 - f(z_1))^2\right] + 1 - I_f(Z_{22})$.

This is the same result that was presented for the Witsenhausen counterexample in [1].
Intuitively, when we have infinite transmission noise intensity, we might as well deny
the access to $z_1$ for the second station, and this is exactly the case in Witsenhausen's
counterexample. The optimal strategies for this case are still unknown. Witsenhausen
showed that the optimal solution exists, even if $x_0$ has a general distribution with a
finite second moment [1]. He then showed that if one of the strategies is restricted to
being affine, the other optimal strategy would also be affine. But then he provided a
set of nonlinear strategies that could achieve a lower cost for some values of $k^2$ and $\sigma_0$.
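The gap between affine and nonlinear strategies in this limit can be reproduced numerically. The sketch below is an illustration under stated assumptions: it uses the commonly quoted parameter values $k^2 = 0.04$ and $\sigma_0 = 5$, compares the best linear strategy $f(z_1) = c z_1$ with the two-point signaling strategy $f(z_1) = \sigma_0\, \mathrm{sign}(z_1)$ and its matching estimator $g(z_{22}) = \sigma_0 \tanh(\sigma_0 z_{22})$, and evaluates the second-stage expectation with a simple trapezoid rule. The function names are hypothetical.

```python
import math


def affine_cost(c, k2, sigma0):
    """Cost of f(z1) = c*z1 with its optimal linear g when station 2 only sees z22."""
    s = sigma0 ** 2
    return k2 * (1.0 - c) ** 2 * s + c * c * s / (1.0 + c * c * s)


def sign_cost(k2, sigma0, n=4001, width=10.0):
    """Cost of f(z1) = sigma0*sign(z1) with g(z22) = sigma0*tanh(sigma0*z22)."""
    # First stage: k^2 E[(z1 - sigma0*sign(z1))^2] with E|z1| = sigma0*sqrt(2/pi).
    first = 2.0 * k2 * sigma0 ** 2 * (1.0 - math.sqrt(2.0 / math.pi))
    # Second stage: sigma0^2 * E[(1 - tanh(sigma0*(sigma0 + v)))^2], v ~ N(0, 1).
    h = 2.0 * width / (n - 1)
    second = 0.0
    for i in range(n):
        v = -width + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        density = math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)
        second += weight * (1.0 - math.tanh(sigma0 * (sigma0 + v))) ** 2 * density * h
    return first + sigma0 ** 2 * second


best_affine = min(affine_cost(i / 1000.0, 0.04, 5.0) for i in range(2001))
nonlinear = sign_cost(0.04, 5.0)
```

For these parameter values the signaling strategy costs well under half of the best affine cost, which illustrates why the optimal strategies cannot be affine in general.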

Different approaches have been taken in order to find the optimal strategies. As
mentioned before, an asymptotic approach was used in [2] for the case where $\sigma_0$ is
small. More recently, in [5], [6], [7] it was shown how a neural network, trained by
stochastic approximation techniques, can be employed as a nonlinear function
approximator in order to approximate $f(z_1)$. It was demonstrated that the optimal $f^*(z_1)$
may not be strictly piecewise, as was suggested by Witsenhausen, but slightly sloped.
Some researchers have tried to attack the problem numerically and use some sample
and search techniques to find the solution. A discretized version of the problem was
formulated in [8], which was later shown in [9] to be NP-complete and computationally
intractable. It was recently asserted in [10] and [11] that a global optimum would
be achieved by searching directly in the strategy space using the generalized step
functions to approximate $f(z_1)$.

So far we have shown, through a simple example, how any uncertainty in the
transmission of information between the stations in a distributed system can make
the optimal control design very complicated and even intractable. Then, by
considering the two limit cases, we showed how our example covers a very wide range of
scenarios. Namely, we saw that for the noiseless transmission case, the unique optimal
strategies, which are linear in the information, are easily obtained, whereas for the
infinite transmission noise intensity, the optimal strategies are still unknown. Now a
very feasible case to investigate is when the uncertainty on the information
transmission is small. In fact, when the transmission noise intensity $\epsilon$ is small, one would still
expect behavior similar to the noiseless transmission case for the optimal strategies.
In the following sections, we consider this case. Namely, we assume a small intensity
for $v_t$. Under this assumption, we obtain the first few terms in the expansion of the
performance index in terms of $\epsilon$. We then use the Hamiltonian approach in order to
find a necessary condition for the strategies that minimize the approximated cost.

We show that the linear strategies, with slightly different coefficients than the
corresponding coefficients for the noiseless transmission case, do indeed satisfy the
necessary condition. This asymptotic analysis not only gives us insight on how the
optimal strategies change as the transmission uncertainty is introduced but also
provides us with a better sense of the complexities in the design procedure.

5. An expansion for the cost. Assume that the first station communicates
with the second station through a low noise channel. In other words, the transmission
noise intensity $\epsilon$ is assumed to be small. In this section, we will find an expansion for
the cost in terms of $\epsilon$. For this purpose, we first find an expansion for the probability
density function of the information available to the second station, i.e., $p(z_2)$. Then
we use (3.7) in order to find the corresponding expansion for $g^*(z_2)$. By substituting
back in (3.5), we will obtain the expanded cost only in terms of $f$.

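The probability density function for $z_2$ can be written as

(5.1)  $p(z_2) = \int_{-\infty}^{+\infty} p(z_{22}, z_{21}, z_1)\, dz_1$

(5.2)  $\;\;\, = \int_{-\infty}^{+\infty} p(z_{22} \mid z_{21}, z_1)\, p(z_{21} \mid z_1)\, p(z_1)\, dz_1$

(5.3)  $\;\;\, = \int_{-\infty}^{+\infty} p(z_{22} \mid z_1)\, p(z_{21} \mid z_1)\, p(z_1)\, dz_1$

(5.4)  $\;\;\, = \int_{-\infty}^{+\infty} p(z_{22} \mid z_1)\, p_{v_t}(z_{21} - z_1)\, p(z_1)\, dz_1$

(5.5)  $\;\;\, = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(z_{22} - f(z_1))^2}{2} \right) \frac{1}{\sqrt{2\pi}\,\epsilon} \exp\left( -\frac{(z_{21} - z_1)^2}{2\epsilon^2} \right) \frac{1}{\sqrt{2\pi}\,\sigma_0} \exp\left( -\frac{z_1^2}{2\sigma_0^2} \right) dz_1$,

where for (5.3) we have used the facts that the $\sigma$-fields generated by $\{z_{21}, z_1\}$ and
$\{z_1, v_t\}$ are the same and $z_1$, $v_t$, and $v_2$ are mutually independent. At this point,
one should note that even though the joint probability density function $p(z_{22}, z_{21}, z_1)$
can be explicitly expressed as in (5.5), introduction into the performance index shows
that determination of $f(z_1)$ still requires averaging over all random variables. This
is another way of looking at the effect of a nonclassical information pattern, which is
not partially nested. We therefore decide to follow an asymptotic approach.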

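For small $\epsilon$, we now approximate $\ln p_\epsilon(z_2)$ by considering only the first three
terms of its expansion around $\epsilon = 0$. Namely,

(5.6)  $\ln p_\epsilon(z_2) \simeq \ln p_0(z_2) + \left. \frac{\partial}{\partial \epsilon} \ln p_\epsilon(z_2) \right|_{\epsilon=0} \cdot \epsilon + \left. \frac{\partial^2}{\partial \epsilon^2} \ln p_\epsilon(z_2) \right|_{\epsilon=0} \cdot \epsilon^2$.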

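By making the change of variables

(5.7)  $\epsilon y := z_1 - z_{21} \;\Rightarrow\; \epsilon\, dy = dz_1$,

we can write $p_\epsilon(z_2)$ in the following form:

(5.8)  $p_\epsilon(z_2) = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(z_{22} - h(y))^2}{2} \right) \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{y^2}{2} \right) \frac{1}{\sqrt{2\pi}\,\sigma_0} \exp\left( -\frac{(z_{21} + \epsilon y)^2}{2\sigma_0^2} \right) dy$,

where

(5.9)  $h(y) := f(\epsilon y + z_{21})$.

It is now clear that

(5.10)  $p_0(z_2) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(z_{22} - f(z_{21}))^2}{2} \right) \frac{1}{\sqrt{2\pi}\,\sigma_0} \exp\left( -\frac{z_{21}^2}{2\sigma_0^2} \right)$,

and hence

(5.11)  $\ln p_0(z_2) = -\frac{(z_{22} - f(z_{21}))^2}{2} - \frac{z_{21}^2}{2\sigma_0^2} - \ln(2\pi\sigma_0)$.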
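For the first order term, we have

(5.12)  $\frac{\partial}{\partial \epsilon} \ln p_\epsilon(z_2) = \frac{1}{p_\epsilon(z_2)} \frac{\partial}{\partial \epsilon} p_\epsilon(z_2)$.

On the other hand, the derivative of the integrand in (5.8) with respect to $\epsilon$, evaluated
at $\epsilon = 0$, is an odd function of $y$, so that

(5.13)  $\left. \frac{\partial}{\partial \epsilon} p_\epsilon(z_2) \right|_{\epsilon=0} = 0$.

Therefore,

(5.14)  $\left. \frac{\partial}{\partial \epsilon} \ln p_\epsilon(z_2) \right|_{\epsilon=0} = 0$.

This result is to be expected, because the behavior of $p_\epsilon(z_2)$ should depend only
on the variance of the Gaussian transmission noise, i.e., $\epsilon^2$.
Using (5.14), we can now obtain the second order term as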


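(5.15)  $\left. \frac{\partial^2}{\partial \epsilon^2} \ln p_\epsilon(z_2) \right|_{\epsilon=0} = \frac{1}{p_0(z_2)} \left. \frac{\partial^2}{\partial \epsilon^2} p_\epsilon(z_2) \right|_{\epsilon=0}$.

After some tedious but straightforward manipulations, we get

        $\left. \frac{\partial^2}{\partial \epsilon^2} \ln p_\epsilon(z_2) \right|_{\epsilon=0} = -f'^2(z_{21}) + f''(z_{21}) \left( z_{22} - f(z_{21}) \right) + f'^2(z_{21}) \left( z_{22} - f(z_{21}) \right)^2$

(5.16)  $\qquad\qquad + 2 f'(z_{21}) \left( z_{22} - f(z_{21}) \right) \left( -\frac{z_{21}}{\sigma_0^2} \right) - \frac{1}{\sigma_0^2} + \frac{z_{21}^2}{\sigma_0^4}$.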
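The vanishing first order term (5.14) and the second order term (5.16) can be checked numerically by evaluating the integral form of $p_\epsilon(z_2)$ on a grid and differencing in $\epsilon$. The sketch below is only an illustration: the strategy $f = \sin$, the value of `SIGMA0`, the evaluation point, and the function names are arbitrary assumptions, and the integral over $y$ is approximated with a trapezoid rule.

```python
import math

SIGMA0 = 1.5                       # assumed prior standard deviation of z1
f = math.sin                       # assumed smooth strategy, for illustration only
df = math.cos
d2f = lambda z: -math.sin(z)


def log_p(eps, z21, z22, n=4001, width=8.0):
    """ln p_eps(z2), integrating over y = (z1 - z21)/eps with the trapezoid rule."""
    h = 2.0 * width / (n - 1)
    total = 0.0
    for i in range(n):
        y = -width + i * h
        z1 = z21 + eps * y
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * h * math.exp(-0.5 * (z22 - f(z1)) ** 2
                                       - 0.5 * y * y
                                       - 0.5 * (z1 / SIGMA0) ** 2)
    return math.log(total / ((2.0 * math.pi) ** 1.5 * SIGMA0))


def second_order_term(z21, z22):
    """Closed-form second derivative of ln p_eps(z2) at eps = 0."""
    r = z22 - f(z21)
    return (-df(z21) ** 2 + d2f(z21) * r + df(z21) ** 2 * r ** 2
            + 2.0 * df(z21) * r * (-z21 / SIGMA0 ** 2)
            - 1.0 / SIGMA0 ** 2 + z21 ** 2 / SIGMA0 ** 4)


eps, z21, z22 = 0.01, 0.5, 0.2
plus, zero, minus = log_p(eps, z21, z22), log_p(0.0, z21, z22), log_p(-eps, z21, z22)
first_derivative = (plus - minus) / (2.0 * eps)          # central difference
second_derivative = (plus - 2.0 * zero + minus) / eps ** 2
```

The central difference for the first derivative is zero up to rounding, and the second difference agrees with `second_order_term` to within the expected truncation error of the finite-difference step.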


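We can now obtain a second order approximation for $\ln p_\epsilon(z_2)$ by substituting the
corresponding terms from (5.11), (5.14), and (5.16) back into the expansion (5.6). In
the next step, we substitute the expansion for $\ln p_\epsilon(z_2)$ in (3.7) in order to find the
corresponding expansion for $g^*(z_2)$. Remember that $g^*(z_2)$ is the optimal strategy
for the second station, assuming that the first station has a fixed strategy
$\gamma_1(z_1) = f(z_1) - z_1$. We have

        $g^*(z_2) = z_{22} + \frac{\partial}{\partial z_{22}} \ln p(z_2)$

        $\;\;\, \simeq z_{22} + \frac{\partial}{\partial z_{22}} \ln p_0(z_2) + \epsilon^2 \frac{\partial}{\partial z_{22}} \left( \left. \frac{\partial^2}{\partial \epsilon^2} \ln p_\epsilon(z_2) \right|_{\epsilon=0} \right)$

        $\;\;\, = z_{22} - \left( z_{22} - f(z_{21}) \right)$

(5.17)  $\qquad + \epsilon^2 \left( f''(z_{21}) + 2 f'^2(z_{21}) \left( z_{22} - f(z_{21}) \right) + 2 f'(z_{21}) \left( -\frac{z_{21}}{\sigma_0^2} \right) \right)$.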

Our goal is to get an expansion for the cost, which as we know from (3.5) can be
written as

(5.18)  $J^*(f) = k^2 E\left[(z_1 - f(z_1))^2\right] + E\left[(f(z_1))^2\right] - E\left[(g^*(z_2))^2\right]$.

Using the expansion for $g^*(z_2)$ from (5.17), we have

        $E\left[(g^*(z_2))^2\right] \simeq E\left[(f(z_{21}))^2\right]$

(5.19)  $\qquad + 2\epsilon^2 E\left[ f(z_{21}) \left( f''(z_{21}) + 2 f'^2(z_{21}) \left( z_{22} - f(z_{21}) \right) + 2 f'(z_{21}) \left( -\frac{z_{21}}{\sigma_0^2} \right) \right) \right]$,

where we have neglected the fourth order term in $\epsilon$. Substituting this expansion back
in (5.18), we will obtain the following expansion for the cost:

        $J^*(f) = k^2 E\left[(z_1 - f(z_1))^2\right] + E\left[(f(z_1))^2\right] - E\left[(f(z_{21}))^2\right]$

(5.20)  $\qquad - 2\epsilon^2 E\left[ f(z_{21}) \left( f''(z_{21}) + 2 f'^2(z_{21}) \left( z_{22} - f(z_{21}) \right) + 2 f'(z_{21}) \left( -\frac{z_{21}}{\sigma_0^2} \right) \right) \right]$.

Note that when the transmission is noiseless, i.e., $\epsilon = 0$ and therefore $z_{21} = z_1$, we
have

(5.21)  $J^*(f) = k^2 E\left[(z_1 - f(z_1))^2\right]$,

and $f(z_1) = z_1$ is the obvious unique optimal solution. The above expansion,
however, is not exactly in our desired form yet. This is because the third term on the
right-hand side, which is an average over $z_{21}$, still depends on $\epsilon$. We shall now rewrite
the expansion in (5.20) by explicitly expressing the expectations based on the
corresponding probability densities:

        $J^*(f) = \int_{-\infty}^{+\infty} \left[ k^2 (t - f(t))^2 + f^2(t) \right] \frac{1}{\sqrt{2\pi}\,\sigma_0} e^{-\frac{t^2}{2\sigma_0^2}}\, dt$

        $\qquad - \int_{-\infty}^{+\infty} \left[ f^2(t) + 2\epsilon^2 \left( f(t) f''(t) - 2 f(t) f'(t) \frac{t}{\sigma_0^2} \right) \right] \frac{1}{\sqrt{2\pi (\sigma_0^2 + \epsilon^2)}} e^{-\frac{t^2}{2(\sigma_0^2 + \epsilon^2)}}\, dt$

(5.22)  $\qquad - \int_{-\infty}^{+\infty}\!\! \int_{-\infty}^{+\infty} 4\epsilon^2 f(t) f'^2(t) \left( r - f(t) \right) \frac{1}{\sqrt{2\pi}} e^{-\frac{(r - f(t))^2}{2}} \frac{1}{\sqrt{2\pi}\,\sigma_0} e^{-\frac{t^2}{2\sigma_0^2}}\, dt\, dr$,

where we have substituted $p(z_2) = p(z_{22}, z_{21}) \simeq p_0(z_2)$ in the third term, since the
higher order terms would be multiplied by $\epsilon^2$ and would then be neglected. Now the
third term turns out to be zero, because

(5.23)  $\int_{-\infty}^{+\infty} \left( r - f(t) \right) \frac{1}{\sqrt{2\pi}} e^{-\frac{(r - f(t))^2}{2}}\, dr = 0$.


At the same time, we can expand the probability density of $z_{21}$ up to the second order
in $\epsilon$. It is actually straightforward to obtain


(5.24) 


\/2tt  (ct§ 


+  €2) 


\/27T(To 


Substituting (5.23) and the above expansion back in (5.22) and neglecting the higher
order terms in $\epsilon$, we can finally get the following expansion for the cost:

(5.25)  $J^*(f) = \int_{-\infty}^{+\infty} \left[ k^2 (t - f(t))^2 + \epsilon^2 \left( 4 f(t) f'(t) \frac{t}{\sigma_0^2} - 2 f(t) f''(t) + f^2(t) \frac{\sigma_0^2 - t^2}{\sigma_0^4} \right) \right] \frac{1}{\sqrt{2\pi}\,\sigma_0} e^{-\frac{t^2}{2\sigma_0^2}}\, dt$.

The objective is now to obtain the function $f$, which minimizes the above
approximated cost. In the next section, we use a variational approach in order to find a
necessary condition for such a function and show how the linear strategies still satisfy
this necessary condition.


6. Minimizing the approximated cost. So far, we have obtained an expansion
for the cost assuming that the transmission noise intensity is small. We have, in
fact, approximated the cost by including only up to the second order term in $\epsilon$. We
should now try to minimize this approximated cost in order to find the asymptotically
optimal $f^*$. Obviously, this strategy would be optimal only for a small transmission
noise intensity. However, it would still be very helpful for the analysis of the behavior
of the optimal strategies when we deviate a little bit from the classical information
pattern by introducing a small communication uncertainty.

We now use the Hamiltonian approach in order to find the necessary conditions
for the function $f(t)$, which minimizes our approximated cost. For simplicity, denote

(6.1)  $x_1(t) := f(t)$,

(6.2)  $x_2(t) := \dot{x}_1(t) = f'(t)$,

(6.3)  $u(t) := \dot{x}_2(t) = \ddot{x}_1(t) = f''(t)$,

(6.4)  $p(t) := \frac{1}{\sqrt{2\pi}\,\sigma_0} e^{-\frac{t^2}{2\sigma_0^2}}$.
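The Hamiltonian is then defined as [12]

       $H = k^2 \left( t - x_1(t) \right)^2 p(t) + \epsilon^2 \left( 4 x_1(t) x_2(t) \frac{t}{\sigma_0^2} - 2 x_1(t) u(t) + x_1^2(t) \frac{\sigma_0^2 - t^2}{\sigma_0^4} \right) p(t)$

(6.5)  $\qquad + \lambda_1(t) x_2(t) + \lambda_2(t) u(t)$,

where $\lambda_1$ and $\lambda_2$ are the Lagrange multipliers that should satisfy

       $\dot{\lambda}_1(t) = -H_{x_1}$

(6.6)  $\;\;\, = \left( 2k^2 \left( t - x_1(t) \right) - 4\epsilon^2 x_2(t) \frac{t}{\sigma_0^2} - 2\epsilon^2 x_1(t) \frac{\sigma_0^2 - t^2}{\sigma_0^4} + 2\epsilon^2 u(t) \right) p(t)$,

       $\dot{\lambda}_2(t) = -H_{x_2}$

(6.7)  $\;\;\, = -4\epsilon^2 x_1(t) \frac{t}{\sigma_0^2} p(t) - \lambda_1(t)$.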

But as we can see, the Hamiltonian is linear in $u(t)$ and we actually have a singular
optimization problem. The singular surface will be characterized by setting $H_u$ and
its derivatives with respect to $t$ equal to zero, that is,

(6.8)  $H_u = -2\epsilon^2 x_1(t) p(t) + \lambda_2(t) = 0$,

and

(6.9)  $\frac{d}{dt} H_u = -2\epsilon^2 \dot{x}_1(t) p(t) - 2\epsilon^2 x_1(t) \dot{p}(t) + \dot{\lambda}_2(t) = 0$.

Substituting $\dot{p}(t) = -\frac{t}{\sigma_0^2} p(t)$ and also $\dot{\lambda}_2$ from (6.7), we get

(6.10)  $\frac{d}{dt} H_u = -2\epsilon^2 x_2(t) p(t) - 2\epsilon^2 x_1(t) \frac{t}{\sigma_0^2} p(t) - \lambda_1(t) = 0$.

Differentiating again and substituting $\dot{\lambda}_1$ from (6.6), we have

(6.11)  $\frac{d^2}{dt^2} H_u = -4\epsilon^2 u(t) p(t) + 4\epsilon^2 \frac{t}{\sigma_0^2} x_2(t) p(t) - 2k^2 \left( t - x_1(t) \right) p(t) = 0$.

Therefore, the corresponding $u(t)$ on the singular surface is

(6.12)  $u(t) = x_2(t) \frac{t}{\sigma_0^2} - \frac{k^2}{2\epsilon^2} \left( t - x_1(t) \right)$.

Note that the first order generalized Legendre-Clebsch condition, which is a necessary
condition for $u(t)$ to be minimizing on the singular surface, is also satisfied, namely,

(6.13)  $\frac{\partial}{\partial u} \left[ \frac{d^2}{dt^2} H_u \right] = -4\epsilon^2 p(t) < 0$.

Therefore, the corresponding $x_1(t)$ and $x_2(t)$, which minimize our approximated cost,
should necessarily satisfy the following differential equations:

(6.14)  $\dot{x}_1(t) = x_2(t)$,

(6.15)  $\dot{x}_2(t) = x_2(t) \frac{t}{\sigma_0^2} - \frac{k^2}{2\epsilon^2} \left( t - x_1(t) \right)$.

Since $\epsilon$ is assumed to be small, we may assume the following form in order to obtain
the solutions for the above differential equations:

(6.16)  $x_1(t) = a_0(t) + \epsilon^2 a_2(t) + \epsilon^4 a_4(t) + \cdots$,

(6.17)  $x_2(t) = b_0(t) + \epsilon^2 b_2(t) + \epsilon^4 b_4(t) + \cdots$.

Interestingly enough, by substituting the above $x_1$ and $x_2$ back into the differential
equations and comparing the coefficients of the terms with the same order in $\epsilon$, we get

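(6.18)  $x_1(t) = \left[ 1 - \frac{2\epsilon^2}{k^2 \sigma_0^2} + \frac{4\epsilon^4}{k^4 \sigma_0^4} - \cdots \right] t = \frac{k^2 \sigma_0^2}{k^2 \sigma_0^2 + 2\epsilon^2}\, t$.

Back to our original notation, we actually have

(6.19)  $f(z_1) = \frac{k^2 \sigma_0^2}{k^2 \sigma_0^2 + 2\epsilon^2}\, z_1$.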
As we can see, the solution is still linear with a coefficient which is slightly different
than the corresponding coefficient for the noiseless transmission case. Remember that
$f(z_1) = z_1$ is the optimal solution when there is no transmission noise, and note that
for $\epsilon = 0$ in (6.19) we get exactly the same solution as expected. Given the above
function $f(z_1)$, the corresponding $g^*(z_2)$ can easily be obtained using (3.4). Note
that it will also be linear because of the Gaussian assumption for the underlying
uncertainties.
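That a linear function satisfies the singular-arc equations can be checked by direct substitution. The sketch below assumes the differential equations in the form $\dot{x}_1 = x_2$, $\dot{x}_2 = x_2 t / \sigma_0^2 - (k^2 / 2\epsilon^2)(t - x_1)$, as read from (6.14)-(6.15), and a trial solution $x_1(t) = c\,t$; the function names and the numerical parameter values are hypothetical.

```python
def singular_arc_residual(c, t, k2, sigma0, eps):
    """Residual of the second singular-arc equation for x1(t) = c*t, x2(t) = c."""
    x1, x2, dx2 = c * t, c, 0.0      # a linear x1 has constant slope, so dx2/dt = 0
    return dx2 - (x2 * t / sigma0 ** 2 - k2 / (2.0 * eps ** 2) * (t - x1))


def linear_coefficient(k2, sigma0, eps):
    """Slope c for which the residual vanishes at every t."""
    return k2 * sigma0 ** 2 / (k2 * sigma0 ** 2 + 2.0 * eps ** 2)


c = linear_coefficient(k2=0.04, sigma0=5.0, eps=0.1)
residuals = [singular_arc_residual(c, t, 0.04, 5.0, 0.1) for t in (-3.0, 0.0, 2.5)]
```

The residual is zero at every `t` only for this particular slope, and the slope tends to one as `eps` goes to zero, recovering the noiseless strategy.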

The linearity of the optimal strategies could have been anticipated from the beginning.
As we mentioned in section 2, linear strategies were shown to be asymptotically optimal
for the Witsenhausen example when the uncertainty on the information available
to the first station is small [2]. In this paper, however, we have considered a reformulation
of Witsenhausen's problem where the first station sends its information to
the second station through a low noise channel. These two scenarios are somewhat
similar: in both, the second station can determine the information
available to the first station fairly accurately. Specifically, in the first scenario, the
second station almost knows $z_1$ because of its small uncertainty, while in the second
scenario it can determine $z_1$ from the information that is transmitted through a low
noise channel.

We would also expect the optimal strategies to approach the corresponding strategies
for the noiseless transmission case as the value of $z_1$ and, in some sense, the
signal-to-noise ratio increase. This does not seem to happen in the solution (6.19).
One may justify this by looking at the exponential function in the cost (5.25). This
function drives the integrand of the cost to zero exponentially fast for large values of
$z_1$. Therefore, the structure of the cost does not force the optimal solution to
approach $f(z_1) = z_1$ as $z_1$ increases.

We shall now obtain the corresponding value of the cost. Substituting $f(z_1)$ from
(6.19) back into the cost (5.25), we get


where  we  have  used 


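(6.21)  $\int_{-\infty}^{\infty} \frac{t^2}{\sqrt{2\pi}\,\sigma_0}\, e^{-t^2/(2\sigma_0^2)}\,dt = \sigma_0^2,$

(6.22)  $\int_{-\infty}^{\infty} \frac{t^4}{\sqrt{2\pi}\,\sigma_0}\, e^{-t^2/(2\sigma_0^2)}\,dt = 3\sigma_0^4.$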
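
Reading (6.21) and (6.22) as the second and fourth moments of a zero-mean Gaussian variable with variance $\sigma_0^2$ (an assumption on our part, as are the helper name `gaussian_moment` and the value chosen for `sigma0`), the two identities can be checked numerically with a short Python sketch:

```python
import math


def gaussian_moment(order, sigma0, half_width=12.0, steps=20000):
    """Midpoint-rule estimate of E[t**order] for t ~ N(0, sigma0**2).

    The integral is truncated to +/- half_width standard deviations,
    where the Gaussian tail is negligible.
    """
    a = -half_width * sigma0
    h = 2.0 * half_width * sigma0 / steps
    norm = math.sqrt(2.0 * math.pi) * sigma0
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * h
        density = math.exp(-t * t / (2.0 * sigma0 ** 2)) / norm
        total += (t ** order) * density * h
    return total


sigma0 = 1.5  # hypothetical value, chosen only for illustration
second = gaussian_moment(2, sigma0)   # should be close to sigma0**2
fourth = gaussian_moment(4, sigma0)   # should be close to 3 * sigma0**4
```

The estimates should agree with $\sigma_0^2$ and $3\sigma_0^4$ to within the accuracy of the quadrature.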


The optimal cost for the noiseless transmission case is zero. But if we use $f(z_1) = z_1$
when the transmission is noisy, we get the following cost:

(6.23)  $J^*(f) = 2\epsilon^2.$

In other words, if we fix the strategies to be the optimal strategies for the noiseless
transmission case while introducing a small transmission noise, the increase in the cost
will be proportional to the transmission noise intensity. However, if we use (6.19), we
can indeed improve the cost by the fourth order in $\epsilon$.

One should note from (6.19) and (6.20) that as the value of $k^2\sigma_0^2$ increases, the
asymptotically optimal solution approaches $f(z_1) = z_1$, and the change in the cost
becomes smaller. In other words, increasing $k^2\sigma_0^2$ has an effect similar to decreasing
the communication uncertainty. To explain this, we note from the performance index
that increasing $k^2$ implies a more expensive control action for the first station, which,
in turn, results in smaller $u_1$. This then implies that the information available to the
second station is less affected by the action of the first station. At the same time,
increasing $\sigma_0^2$ implies a higher level of uncertainty on $x_0$, which, incidentally, is the
piece of information that is being transmitted between the stations.

This brings up an example of a very interesting fundamental issue: the notion of
information value and how it could be different for control and communication purposes.
In fact, we know from information theory that a higher level of uncertainty for
a piece of information implies a higher level of entropy and therefore a more valuable
piece of information for transmission. On the other hand, a more uncertain
piece of information would probably be less valuable for control purposes and would
have a smaller effect on the control strategies. In other words, a control designer would
probably be willing to spend less on installing transmitters on the stations for communicating
more uncertain pieces of information. While defining a notion for the value
of information for control purposes has been occasionally addressed in the literature
for quite a long time, it still remains an open problem. This is mostly because
the value of information for control purposes depends strongly on how
the cost is defined for the control design, and this could be quite different in various
applications.

7. Concluding remarks. We analyzed an example of a decentralized stochastic
system. This example was a reformulation of the Witsenhausen counterexample where
the first station was allowed to send its information to the second station through a
noisy channel. The dynamics were linear, all the underlying uncertainties were assumed
to be Gaussian, and the cost was quadratic. It was shown that as soon as
any uncertainty is introduced in the communication among the stations, the information
pattern again becomes nonclassical and is not partially nested. We then
showed how the performance index can be alternatively expressed such that the possible
nonconvexity of the cost, with respect to the control strategies, becomes more
transparent. Therefore, in general, we will end up with a nonconvex functional optimization
problem when we try to obtain the decentralized optimal control algorithms.
We then considered two limit cases, namely, the case where there is no communication
uncertainty and the case in which the transmission noise intensity increases
to infinity. The former case was shown to be a trivial example of a classical LQG
problem, whereas the latter case corresponds to Witsenhausen's counterexample, the
optimal solution of which is still unknown.

We then focused on the case where the communication uncertainty was small.
We followed an asymptotic approach where we approximated the cost based on its
expansion in terms of the small transmission noise intensity. We showed how minimizing
the approximated cost can be seen as a singular optimization problem. We
then used a variational approach in order to find the necessary conditions for the
asymptotically optimal strategies and showed that some reasonable linear strategies
would actually satisfy those conditions. We also provided some intuitive explanations
for the behavior of those linear strategies and obtained the corresponding cost.

Note that while we have focused on the reformulated Witsenhausen counterexample,
our main result is quite general. In fact, we have shown through an example that
communication uncertainties in decentralized systems generally result in nonclassical
information patterns, which, in turn, can destroy the convexity of the associated
functional optimization problems. Moreover, our approach is indeed a very general
approach, which has been applied to various other problems before. More specifically,
expanding a cost function in terms of some small parameters is a common practice in
variational and perturbation-based approaches. Furthermore, using a Hamiltonian approach
in order to obtain the necessary conditions for the optimal strategies obviously
is not specific to our reformulated Witsenhausen problem. However, finding the exact
function (6.19), which is obtained in closed form, satisfies the necessary condition for
optimality, and shows how the optimal strategies could change upon introduction of
some communication uncertainty, could be very specific to our problem.

All  the  derivations  and  the  results  in  this  paper  show  some  of  the  difficulties 
involved  in  dealing  with  decentralized  systems  as  soon  as  we  deviate  a  little  bit  from 
a  classical,  or  at  least  a  partially  nested,  information  pattern.  On  the  other  hand, 
even  though  we  have  modeled  the  communication  uncertainty  in  the  simplest  possible 
way,  we  have  tried  to  emphasize  the  role  of  communication  uncertainties  in  generating 
such  information  patterns  that  are  very  difficult  to  handle. 

Finally,  it  should  be  mentioned  that  even  though  the  optimization  problem  is 
generally  difficult  for  this  class  of  systems,  in  some  applications  one  might  be  able  to 
exploit  the  specific  structure  of  the  system  in  order  to  obtain  some  reasonably  good 
suboptimal strategies, which could yield an acceptable performance.

Abstract

It is known that dynamic allocation of channels and power in a Frequency/Time Division Multiple Access
(FDMA/TDMA)  system  can  improve  performance  and  achieve  higher  capacity.  Various  algorithms  have 
been  separately  proposed  for  dynamic  channel  assignment  and  power  control.  Moreover,  integrated  Dynamic 
Channel  and  Power  Allocation  (DCPA)  algorithms  have  already  been  proposed  based  on  simple  power  control 
algorithms.  In  this  paper,  we  propose  a  DCPA  scheme  based  on  a  novel  predictive  power  control  algorithm. 
The  Minimum  Interference  Dynamic  Channel  Assignment  algorithm  is  employed,  while  simple  Kalman  Filters 
are  designed  to  provide  the  predicted  measurements  of  both  the  channel  gains  and  the  interference  levels,  which 
are  then  used  to  update  the  power  levels.  Local  and  global  stability  of  the  network  are  analyzed  and  extensive 
computer  simulations  are  carried  out  to  show  the  improvement  in  performance,  under  the  dynamics  of  user 
arrivals  and  departures  and  user  mobility.  It  is  shown  that  call  droppings  and  call  blockings  are  decreased 
while,  on  average,  fewer  channel  reassignments  per  call  are  required. 

I.  Introduction 

With the ever increasing need for capacity in mobile radio systems, optimal allocation of resources in non-uniform
and non-stationary environments has become a great challenge. The fundamental objective is to
accommodate as many users as possible, subject to complexity and Quality of Service requirements, on a
limited available bandwidth by controlling undesired interactions among the users. One major interaction is
the co-channel interference that every user generates for all other users that are sharing the same channel.
Various techniques have been developed to mitigate the effects of co-channel interference. Some of these
techniques, such as sectorization and beamforming using smart antenna arrays, try to suppress interference,
while others such as channel assignment techniques try to avoid strong interferers.

*This research was supported in part by the Air Force Office of Scientific Research under Grant Number F49620-00-1-0154.

lll500 W. Olympic Blvd., Suite 398, Los Angeles, CA 90064, (kambiz5ixmovicsvireless.com}.

Another  well-known  technique  is  to  adaptively  control  the  power  levels  of  all  the  users  in  the  network.  The 
idea  is  to  keep  the  power  level  for  every  user  at  its  minimum  required  level  according  to  the  current  channel 
conditions.  This  will  eliminate  unnecessary  interference  to  other  users  and  will  also  minimize  the  power 
consumption  for  the  user.  Various  power  control  algorithms  have  been  proposed  in  the  literature  [1]-[12]. 

Our  first  objective  in  this  paper  is  to  design  a  distributed  predictive  power  control  algorithm.  We  try  to 
obtain  accurate  enough  models  for  the  slow  variations  in  the  channel  gains  and  the  interference  powers.  We 
then  design  Kalman  filters  for  every  user  to  obtain  the  one-step  predicted  values  for  both  the  interference  level 
and  the  user’s  channel  gain  from  its  intended  base  station.  We  try  to  tune  the  filters  for  a  typical  mobile  radio 
environment  and  then  conjecture  and  show  through  simulations  that  the  filters  are  indeed  robust  under  a  broad 
range  of  parameters  such  as  user  velocities  and  shadowing  correlation  distances.  The  predicted  measurements 
from  the  Kalman  filters  are  then  used  in  an  integrator  algorithm  to  update  the  power  levels. 

Another approach to mitigate the co-channel interference effects and increase the capacity is to avoid strong
interferers by dynamically assigning the channels to the users. Various centralized and decentralized Dynamic
Channel Assignment (DCA) schemes have been proposed in the literature [13]-[16].

It is believed that an aggressive DCA scheme can make an FDMA/TDMA system an interference-limited
system, where the number of active users is mostly limited by the interference that the users cause on each
other. On the other hand, power control schemes are known to be especially effective for interference-limited
systems. This has initiated research on integrated distributed Dynamic Channel and Power Allocation (DCPA)
schemes [17]-[20]. In [17] a pilot based minimum interference DCA scheme is integrated with a fast fixed-step
power control algorithm, while fast fading and user mobility effects are neglected. In [18] three different
types of minimum interference DCA algorithms are integrated with a slow integrator power control algorithm.
Pedestrian mobility along with a low power update rate are considered and it is again assumed that the fast
fading effects are averaged out. In [19] a simulation study has been performed to investigate the joint effects
of some simple SIR¹-based and signal-level-based power control algorithms along with a minimum interference
channel reassignment scheme. Fast fading effects are again neglected and low power update rates are assumed.

¹ SIR denotes Signal to Interference plus Noise Ratio throughout this paper.

Most DCPA schemes, however, only consider simple power control algorithms. Moreover, except for [18]-[19],
other results neglect such effects as the dynamics of user arrivals and departures, user mobility, and base station
hand-offs.  Our  main  objective  in  this  paper  is  to  investigate  the  performance  of  our  predictive  power  control 
algorithm  when  it  is  integrated  with  a  minimum  interference  DCA  scheme.  We  set  up  a  system-level  simulation 
platform,  similar  to  the  ones  presented  in  [17]-[18],  to  compare  our  predictive  DCPA  scheme  with  the  one  that 
uses  a  simple  integrator  power  control  algorithm  with  no  prediction.  Dynamics  of  user  arrivals  and  departures, 
user  mobility  and  base  station  hand-offs  are  all  considered  in  this  study.  Slowly  varying  flat  Rayleigh  fading 
effects  are  also  considered  in  the  simulations. 

The  organization  of  the  paper  is  as  follows.  In  the  next  section,  we  present  the  system  model  and  review 
some  of  the  results  in  power  control  and  dynamic  channel  assignment.  In  Section  III  we  elaborate  on  our 
predictive  power  control  design.  We  explain  how  simple  Kalman  filters  may  be  designed  and  implemented 
in  order  to  obtain  the  predicted  measurements  of  both  the  channel  gains  and  the  interference  powers.  We 
also  show  that  the  presented  predictive  power  control  algorithm  satisfies  the  sufficient  conditions  for  global 
stability  of  the  network.  In  Section  IV  we  describe  in  detail  our  simulation  models  and  in  Section  V  we  discuss 
the  simulation  results  and  compare  the  performance  of  our  integrated  predictive  DCPA  algorithm  with  the 
corresponding algorithm which uses no prediction. We show that, for a range of traffic loads, the numbers
of blocked calls and dropped calls are decreased under our predictive DCPA scheme. Moreover, on average,
fewer channel reassignments are required for every call, implying a more stable network. We will
provide  concluding  remarks  in  the  final  section. 

II. System Model, Dynamic Channel Assignment and Power Control

We  consider  a  cellular  system  where  the  area  under  coverage  is  divided  into  cells  and  each  cell  has  its  own 
base  station.  All  users  communicate  with  their  assigned  base  stations  through  a  single  hop.  This  is  in  contrast 
to  ad  hoc  wireless  networks  where  there  is  no  fixed  infrastructure  and  multi-hop  communication  is  prevalent. 

We focus on a Frequency/Time Division Multiple Access (FDMA/TDMA) system and only consider the
co-channel  interference  among  the  users,  i.e.,  no  adjacent  channel  interference  is  assumed.  Specifically,  we 
assume  a  system-wide  synchronization  to  the  slot  level  so  that  each  user  will  experience  interference  only  from 
the  users  which  are  sharing  exactly  the  same  slot  on  the  same  carrier  frequency.  This  assumption  implies 
that large enough guard times per slot are assumed. We do not consider any blind slots in the system, that
is, we assume that any slot in a frame can be used as a traffic channel. Blind slots can be avoided either by
appropriate  structuring  of  the  control  channel  or  by  assuming  that  a  call  activity  detection  scheme  is  employed 
such  that  the  users  can  temporarily  discontinue  their  transmission  in  their  active  slots.  Modifying  the  frame 
structure  and  considering  some  slots  as  the  blind  slots  should  not  have  major  effects  on  our  performance 
comparisons. 

We focus on the uplink channel, i.e., the channel from mobiles to base stations. Almost all the results
and  discussions,  however,  could  similarly  be  stated  for  the  downlink  channel.  We  assume  a  fixed-power  pilot 
(control)  channel  on  the  downlink.  As  we  shall  see,  this  channel  facilitates  Dynamic  Channel  Assignment 
(DCA)  and  can  be  used  by  the  mobiles  for  initial  base  station  assignments  and  base  station  hand-offs. 

We abstract the system architecture, as far as modulation, coding, etc., are concerned, and consider SIR
as  the  only  measure  for  Quality  of  Service  (QoS)  in  the  system.  This  is  a  common  practice,  even  though  Bit 
Error  Rate  or  Frame  Error  Rate  are  usually  seen  as  the  ultimate  performance  measures.  The  reason  is  that, 
in  general,  higher  SIR  will  result  in  better  bit  error  rate  performance  and  considering  SIR  as  the  measure  for 
quality  of  service  provides  us  with  a  more  convenient  platform  for  power  control  design. 

The received SIR on an assigned uplink channel for user $i$ can now be written as:

$$\Gamma_i = \frac{g_{ii}\,p_i}{\sum_{j=1,\, j \neq i}^{M} g_{ij}\,p_j + \eta_i}, \qquad (1)$$

where $p_i$ is the transmit power for user $i$, $g_{ii}$ is the channel gain (or attenuation) from user $i$ to its intended
base station (in the linear scale), $g_{ij}$ is the channel gain from user $j$ to the intended base station of user $i$ and
$\eta_i$ is the receiver noise intensity at the intended base station of user $i$. Also $M$ is the total number of users
sharing the channel. We now review the minimum interference dynamic channel assignment scheme along
with the main approaches for power control.
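
As a minimal sketch of how the SIR in (1) could be evaluated, the following Python fragment computes the linear-scale SIR of each co-channel user. The function name `sir`, the list-of-lists layout of the gain matrix, and all numerical values are assumptions made for illustration only.

```python
def sir(i, p, g, eta):
    """Linear-scale SIR of user i as in (1).

    p[j]    : transmit power of user j
    g[i][j] : channel gain from user j to the intended base station of user i
    eta[i]  : receiver noise at the intended base station of user i
    """
    interference = sum(g[i][j] * p[j] for j in range(len(p)) if j != i)
    return g[i][i] * p[i] / (interference + eta[i])


# Hypothetical two-user example (values are not taken from the paper).
g = [[1.0, 0.1],
     [0.2, 0.8]]
p = [1.0, 2.0]
eta = [0.05, 0.05]

sir_0 = sir(0, p, g, eta)  # 1.0 * 1.0 / (0.1 * 2.0 + 0.05)
sir_1 = sir(1, p, g, eta)  # 0.8 * 2.0 / (0.2 * 1.0 + 0.05)
```

Each user's SIR depends on every other co-channel user's power through the denominator, which is the coupling that the power control algorithms reviewed below have to manage.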

A.  Dynamic  Channel  Assignment 

Under  a  Dynamic  Channel  Assignment  (DCA)  scheme,  all  base  stations  have  access  to  all  the  channels  and 
dynamically  assign  the  channels  to  the  users  based  on  the  current  traffic  conditions.  While  DCA  schemes  are 
clearly  more  complicated,  they  usually  result  in  higher  capacity. 

We adopt a distributed Minimum Interference DCA scheme [15]. In this scheme, the new users will be
assigned to the idle channels with minimum local mean interference, in the order they arrive. It was shown
in [26] that when a new user is admitted to a power-controlled network, the optimal power level for the new
user can be written as:

$$p_n = \frac{I_{n0}}{g_{nn}} \cdot \frac{\gamma_n}{1 - \gamma_n / \gamma_{max}}, \qquad (2)$$

where $\gamma_n$ is the SIR threshold that the new user wants to achieve, $\gamma_{max}$ is the maximum achievable SIR for
the new user and $I_{n0}$ is the local mean interference plus noise level at the intended base station of the new
user before it is admitted to the network. It is now clear that the minimum interference DCA scheme does
indeed result in the minimum transmit power for the new user.
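
The channel choice and the resulting power level can be sketched in a few lines of Python. The sketch assumes that (2) has the form $p_n = (I_{n0}/g_{nn})\,\gamma_n / (1 - \gamma_n/\gamma_{max})$, treats $\gamma_{max}$ as a given number, and uses hypothetical names (`pick_min_interference_channel`, `new_user_power`) and made-up measurements; it is not the authors' implementation.

```python
def pick_min_interference_channel(idle_interference):
    """Return (channel, I_n0) for the idle channel with the least local mean interference.

    idle_interference maps a channel id to its measured interference plus noise.
    """
    channel = min(idle_interference, key=idle_interference.get)
    return channel, idle_interference[channel]


def new_user_power(i_n0, g_nn, gamma_n, gamma_max):
    """Power level of the new user under the assumed form of (2), linear scale."""
    if gamma_n >= gamma_max:
        raise ValueError("target SIR is not achievable on this channel")
    return (i_n0 / g_nn) * gamma_n / (1.0 - gamma_n / gamma_max)


# Hypothetical measurements on three idle channels.
idle = {3: 2e-3, 7: 5e-4, 9: 1e-3}
channel, i_n0 = pick_min_interference_channel(idle)
power = new_user_power(i_n0, g_nn=0.01, gamma_n=4.0, gamma_max=20.0)
```

Because the power in the assumed form of (2) scales linearly with $I_{n0}$ when the other quantities are held fixed, picking the idle channel with the smallest interference also gives the smallest transmit power in this sketch.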

Whenever  the  local  mean  SIR  for  a  user  drops  below  a  given  threshold  while  the  user  is  transmitting  at  its 
maximum  power  level,  a  channel  reassignment  attempt  is  triggered  and,  if  possible,  the  user  is  reassigned  to 
the  idle  channel,  which  currently  has  the  minimum  local  mean  interference.  Note  that  this  is  a  distributed 
scheme,  which,  in  general,  is  not  globally  optimal.  Remember  that  any  kind  of  global  optimality  in  the  channel 
assignments  can  only  be  achieved  through  centralized  algorithms,  which  are  usually  impractical  due  to  the 
excessive  requirements  for  processing  and  also  communications  among  the  base  stations. 

Another issue is call management and admission control. As we shall see, a network should be feasible for
every user to be able to achieve its desired SIR threshold. If no admission control is employed, a new user could
potentially force the network out of its feasibility region and hence result in dropping active calls. Therefore,
an admission control mechanism is needed to adjust the trade-off between blocking new calls and dropping
active calls. In [21] an admission algorithm was presented for a power controlled system, where the new users
would increase their powers only in small steps. It was shown how this scheme could protect the quality of
active links when new users arrive. Channel probing techniques were later proposed in [22]-[24], where a new
user would try to estimate the maximum SIR level that it can achieve while disturbing the network as little as
possible. The user will then be admitted only if its maximum achievable SIR is above its desired threshold.
Also a channel partitioning scheme was presented in [25] where a combination of dynamically allocated and
fixed assigned channels is incorporated to develop a rapid distributed access algorithm.

We adopt the simpler threshold-based implicit admission control scheme presented in [18]. In this scheme,
a new user with a desired SIR threshold $\gamma_d$ will be admitted only if there exists an idle channel on which
it can achieve an SIR threshold $\gamma_{new}$, which is higher than $\gamma_d$ by a given protection margin. The value of
the protection margin for new users should be selected based on the trade-off between blocking new calls and
dropping active calls.

Moreover, a channel reassignment attempt will be triggered for a user if, while transmitting at the maximum
power, its local mean SIR drops below a threshold $\gamma_{min}$, which is lower than $\gamma_d$ by another given margin. This
margin is required to avoid an excessive number of channel reassignments. The value of this margin should be
selected according to the trade-off between quality of service and the average number of channel reassignments
per call. Note that for channel reassignment, it is checked whether the user can achieve $\gamma_d$ on the idle channel
which currently has the minimum interference. Since $\gamma_d < \gamma_{new}$, this scheme clearly favors the active users
that are being reassigned over the new incoming users. If a channel reassignment fails, the user stays on its
old channel and the reassignment attempt is repeated every reassignment period (as long as $\Gamma < \gamma_{min}$ and
$p = p_{max}$) until the user is either successfully reassigned or dropped from the network. Finally, a user will be
dropped from the network if its local mean SIR drops and stays below a threshold $\gamma_{drop}$ ($< \gamma_{min}$) for a given
duration of time.
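
The threshold rules for admission, reassignment and dropping described above can be summarized in a small Python sketch. All names (`Thresholds`, `admit`, `needs_reassignment`, `reassignment_ok`, `update_drop_timer`) and all numerical values are hypothetical; the margins would in practice be tuned according to the trade-offs discussed in the text.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    gamma_d: float      # desired SIR threshold (dB)
    new_margin: float   # protection margin for new users (dB)
    min_margin: float   # margin below gamma_d that triggers reassignment (dB)
    gamma_drop: float   # drop threshold, lower than gamma_min (dB)

    @property
    def gamma_new(self):
        return self.gamma_d + self.new_margin

    @property
    def gamma_min(self):
        return self.gamma_d - self.min_margin


def admit(best_idle_sir, th):
    """A new user is admitted only if some idle channel offers at least gamma_new."""
    return best_idle_sir >= th.gamma_new


def needs_reassignment(sir, power, p_max, th):
    """Reassignment is attempted at maximum power when the SIR is below gamma_min."""
    return power >= p_max and sir < th.gamma_min


def reassignment_ok(best_idle_sir, th):
    """An active user only has to reach gamma_d on the best idle channel."""
    return best_idle_sir >= th.gamma_d


def update_drop_timer(timer, sir, dt, th):
    """Accumulate time spent below gamma_drop; reset once the SIR recovers."""
    return timer + dt if sir < th.gamma_drop else 0.0


th = Thresholds(gamma_d=12.0, new_margin=3.0, min_margin=2.0, gamma_drop=8.0)
timer = 0.0
for measured_sir in (7.5, 7.0, 9.0, 7.0):
    timer = update_drop_timer(timer, measured_sir, dt=0.5, th=th)
```

A call would be dropped once the timer exceeds the allowed duration; in the loop above the third measurement resets it, so only the last half second counts.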


B. Power Control


While  DCA  schemes  achieve  higher  levels  of  capacity  by  dynamically  distributing  the  traffic  across  the 
channels,  power  control  techniques  focus  on  every  channel  and  try  to  mitigate  the  co-channel  interference  by 
dynamically  adjusting  the  power  levels  of  the  co-channel  users  at  their  minimum  required  levels.  Therefore 
one  can  reasonably  expect  that  integrating  power  control  with  DCA  can  achieve  even  higher  levels  of  capacity, 
even though the capacity gains may not be exactly additive due to some redundancy between the two schemes
[18].

A widely studied approach for power control is the SIR threshold approach, presented in [4], where the
objective is for the SIR of each user in the network to be above a desired threshold, that is:

$$\Gamma_i = \frac{g_{ii}\,p_i}{\sum_{j \neq i} g_{ij}\,p_j + \eta_i} \geq \gamma_i. \qquad (3)$$


A necessary and sufficient condition for the existence of the optimal power levels $p_i$ that satisfy the above
set of inequalities is called feasibility. In other words, a network of users is called feasible if every user can
achieve its desired SIR. It was shown in [4] that a network is feasible if and only if $\rho(\Gamma(Z - I)) < 1$, where
$Z = [z_{ij}] = [g_{ij}/g_{ii}]$, $\Gamma = \mathrm{diag}(\gamma_1, \ldots, \gamma_M)$, $U = [u_i] = [\gamma_i \eta_i / g_{ii}]$, $I$ is the identity matrix, and $\rho$ denotes
the spectral radius of a matrix. Furthermore, under the feasibility condition, the following simple iterative
algorithm, which could be implemented in a distributed manner, would converge to the optimal power levels:

$$p_i(n) = \frac{\gamma_i}{g_{ii}} \left( \sum_{j \neq i} g_{ij}\,p_j(n-1) + \eta_i \right) = \frac{\gamma_i}{g_{ii}}\,I_i(n) = p_i(n-1)\,\frac{\gamma_i}{\Gamma_i(n)}, \qquad (4)$$

where $I_i(n)$ is the total interference plus noise power at the receiver of the intended base station for user
$i$. Therefore, every user only needs a measurement of its own channel gain and its total interference plus
noise in order to update its power level. Note that $I_i(n)$ depends on the power levels of the users during the
$(n-1)$-th power update period. Also no extra delays are assumed for processing and propagation. Various
generalizations of this algorithm have been presented in the literature. A unified framework along with
convergence analysis for some of these algorithms was presented in [5].
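
A minimal Python sketch of the feasibility test and of the distributed iteration is given below. It assumes that the coupling matrix $\Gamma(Z - I)$ has entries $\gamma_i g_{ij}/g_{ii}$ off the diagonal and zero on it, and that each step sets $p_i(n) = \gamma_i I_i(n)/g_{ii}$. The function names, the power-iteration routine for the spectral radius, and the numerical values are illustrative assumptions.

```python
def coupling_matrix(g, gamma):
    """Assumed form of Gamma (Z - I): gamma_i * g_ij / g_ii off the diagonal."""
    m = len(g)
    return [[0.0 if i == j else gamma[i] * g[i][j] / g[i][i] for j in range(m)]
            for i in range(m)]


def spectral_radius(a, iters=500):
    """Perron root of a nonnegative square matrix, by power iteration on a + I.

    The shift by I avoids the oscillation that plain power iteration can show
    on zero-diagonal matrices such as the coupling matrix above.
    """
    m = len(a)
    v = [1.0] * m
    lam = 1.0
    for _ in range(iters):
        w = [v[i] + sum(a[i][j] * v[j] for j in range(m)) for i in range(m)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam - 1.0


def power_update(p, g, eta, gamma):
    """One synchronous step of the iteration: p_i(n) = gamma_i * I_i(n) / g_ii."""
    m = len(p)
    new_p = []
    for i in range(m):
        interference = sum(g[i][j] * p[j] for j in range(m) if j != i) + eta[i]
        new_p.append(gamma[i] * interference / g[i][i])
    return new_p


# Hypothetical two-user network (values are not taken from the paper).
g = [[1.0, 0.1],
     [0.2, 0.8]]
eta = [0.05, 0.05]
gamma = [4.0, 4.0]

rho = spectral_radius(coupling_matrix(g, gamma))  # feasible if rho < 1
p = [1.0, 1.0]
for _ in range(200):
    p = power_update(p, g, eta, gamma)
```

For these values the spectral radius is about 0.63, so the network is feasible, and the powers settle at 0.5 and 0.75, at which point both users meet their SIR target of 4 with equality.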

In  most  of  these  algorithms,  it  is  assumed  that  all  the  channel  gains  stay  constant  for  the  duration  of  the 
convergence  of  the  algorithm.  Therefore,  it  is  implicitly  assumed  that  the  fading  rate  of  the  channel  is  much 
slower  than  the  power  update  rate.  In  other  words,  neither  the  channel  gain  variations  due  to  user  mobility 
and  fading,  nor  the  measurement  errors  are  taken  into  account.  It  was  recently  shown  in  [6]  that  the  optimal 
powers obtained from the SIR balancing approach, under constant gain assumptions, are very close to the
optimal powers that minimize the Rayleigh fading induced outage probability for every link.

Some  researchers  have  tried  to  analyze  and  possibly  modify  the  power  control  algorithms  to  take  into 
account  the  channel  gain  variations  and  the  fading  induced  measurement  errors.  In  [7]  it  was  shown  how  the 
desired  SIR  for  the  users  may  be  scaled  up  to  guard  against  the  user  mobility  effects.  In  [8]  a  simulation 
study was performed to investigate the user mobility effects on a slow integrator power control algorithm. In
[9]  a  modification  of  the  distributed  SIR  balancing  algorithm  was  proposed,  which  was  less  sensitive  to  SIR 
measurement  errors.  Also  in  [10]  stochastic  measurements  were  incorporated  in  the  power  control  algorithm 
and  it  was  shown  that  the  power  levels  converge,  in  the  mean  square  sense,  to  the  optimal  power  levels.  More 
recently,  it  was  shown  in  [11]  how  a  simple  Kalman  Filter  may  be  designed  to  smooth  out  the  interference 
measurements.  Also  in  [12]  it  was  mentioned  how  a  minimum-variance  power  control  algorithm  may  be 
designed  when  the  channel  gain  variations  are  modeled  by  filtered  white  noise  sequences.  Despite  all  this 
effort  towards  analysis  and  design  of  power  control  algorithms  in  non-stationary  environments,  most  of  the 
results  fail  to  provide  a  systematic  approach. 

An  alternative  approach  is  to  formulate  the  power  control  problem  as  a  decentralized  regulator  problem, 
where  the  objective  is  for  the  SIR  of  every  user  to  track  a  desired  threshold,  while  the  channel  gains  and 
the  interference  levels  are  changing  with  time  and  the  SIR  measurements  can  be  erroneous.  Based  on  this 
approach,  concepts  and  design  methodologies  from  control  theory  have  already  been  used  for  the  analysis  of 
some  power  control  algorithms  [27]  and  design  of  new  algorithms  [12][28]. 

We first note that, in the logarithmic scale, the distributed iterative algorithm in (4) is a simple unity gain
integrator algorithm in a closed loop. Using a bar on the variables to indicate values in dB or dBm, we can
write:

$$\bar{p}_i(n) = \bar{p}_i(n-1) + (\bar{\gamma}_i - \bar{\Gamma}_i(n)) = \bar{p}_i(n-1) + \bar{e}_i(n), \qquad (5)$$

where $\bar{p}_i(n)$ is the power level in dBm for user $i$ for the duration of the $n$-th power update period and $\bar{\Gamma}_i(n)$
is the SIR in dB for the same user at the beginning of the $n$-th power update period:

$$\bar{\Gamma}_i(n) = \bar{p}_i(n-1) + \bar{g}_{ii}(n) - \bar{I}_i(n). \qquad (6)$$

Moreover, $\bar{I}_i(n)$ is the measured local mean interference plus noise power in dBm, available at the beginning
of the $n$-th power update period:

$$\bar{I}_i(n) = 10 \log_{10} \left( \sum_{j \neq i} g_{ij}(n)\,p_j(n-1) + \eta_i(n) \right). \qquad (7)$$

The block diagram for a single loop, associated with a single user, is shown in Figure 1. The controller transfer
function in this case is:

$$K_i(q^{-1}) = \frac{P_i(q^{-1})}{E_i(q^{-1})} = \frac{1}{1 - q^{-1}}, \qquad (8)$$

where $q$ is the shift operator. Therefore, the network can be seen as a set of interconnected local loops. It
should be realized that the coupling among the local loops is through the interference function (7), which,
in general, is nonlinear. The decentralized regulator formulation of the power control problem can now be
presented as follows: "Design a set of local controllers $K_i(q^{-1})$ such that the SIR for every user, $\bar{\Gamma}_i$,
tracks a desired threshold $\bar{\gamma}_i$ with a certain performance while the global network remains stable."
The  local  loops  in  Figure  1  are  quite  general  and  can  be  modified  to  accommodate  different  modeling 
assumptions.  For  example,  extra  delay  blocks  may  be  inserted  in  the  feedback  path  to  model  processing  and 
propagation  delays.  Moreover,  a  saturation  block  may  be  inserted  in  the  forward  path  after  the  controller  to 
model  the  maximum  and  minimum  power  constraints.  Also  we  have  implicitly  assumed  a  linear  time  invariant 
controller by writing $K_i(q^{-1})$. However, in general, the controller itself can be a nonlinear block, as is the case
for  Fixed-Step  power  control  algorithms.  Unfortunately,  analysis  of  stability  and  convergence  of  the  algorithms, 
designed  via  this  approach,  can  be  very  complicated.  Both  local  and  global  stability  for  the  network  should 
be  analyzed  while  feasibility  of  the  network  and  its  implications  should  be  addressed. 

The  global  stability  of  the  network  implies  that  all  the  local  loops  are  stable,  but  the  reverse  is  not  necessarily 
true.  It  was  shown  in  [26]  that  as  long  as  the  network  stays  feasible,  i.e.,  the  channel  gain  variations  do  not 
force  the  network  out  of  its  feasibility  region,  a  sufficient  condition  for  global  stability  of  the  network  is: 


\[ \|G_i(q^{-1})\|_{\ell_\infty\text{-induced}} \le 1, \tag{9} \]

where $G_i(q^{-1})$ is the transfer function from the interference $\bar{I}_i(n)$ to the power $\bar{p}_i(n-1)$
and the $\ell_\infty$-induced norm for the single-input-single-output system can be obtained as:

\[ \|G_i\|_{\ell_\infty\text{-induced}} = \|g_i\|_1 = \sum_{k=0}^{\infty} |g_i(k)|, \tag{10} \]

where $g_i$ denotes the impulse response associated with the transfer function $G_i$.

Hence  if  the  local  loops  are  stable,  and  if  the  feasibility  condition  is  not  violated  and  (9)  is  satisfied  for 
all  local  loops,  then  the  network  will  be  globally  stable  in  the  sense  that  the  deviations  of  the  power  levels 
from  their  corresponding  optimal  values  will  always  remain  bounded.  It  was  also  shown  in  [12]  that  if  the 
channel  gains  are  constant  and  the  network  is  feasible  (i.e.,  a  fixed  optimal  power  vector  P*  exists)  and  if  the 
interference  function  (7)  is  linearized  around  P*,  then  all  small  deviations  of  the  power  levels  in  the  network 
from  their  corresponding  optimal  values  will  asymptotically  converge  to  zero  if: 

\[ \|G_i(q^{-1})\|_{\ell_2\text{-induced}} = \sup_{\omega} \left| G_i(e^{j\omega}) \right| < 1. \tag{11} \]

The above condition is indeed a sufficient condition for global stability of the linearized network in the
$\ell_2$-induced norm sense, while (9) gives a sufficient condition for global stability in the
$\ell_\infty$-induced norm sense without any linearization or any constant gain assumption.

III.  Predictive  Power  Control 

Our  objective  in  this  section  is  to  show  how  simple  models  for  the  variations  in  the  channel  gains  and  the 
interference  levels  may  be  used  in  designing  simple  Kalman  filters,  that  provide  predicted  measurements  for 
both  the  channel  gains  and  the  interference  levels  while  they  mitigate  the  effects  of  the  fast  fading  induced 
measurement  errors. 

We assume that the received SIR measurement or the power command is sent back to the transmitter.
In  other  words,  we  are  considering  information-feedback  closed-loop  power  control  algorithms.  Due  to  the 
limitations  on  the  control  bandwidth  and  on  the  processing  time,  information-feedback  algorithms  usually  run 
at  slower  power  update  rates.  Therefore,  similar  to  DCA  algorithms,  they  operate  on  the  local  mean  values, 
which  are  obtained  through  some  sort  of  averaging  of  the  measurements  over  some  relatively  long  periods. 

A. Models for Variations in Channel Gains and Interference Levels

The  variations  in  the  channel  gains  can  be  characterized  by  the  slowly  changing  shadow  fading  and  the  fast 
multipath  fading  on  top  of  the  distance  loss.  We  consider  log-normal  shadowing  whose  spatial  (or  temporal) 
correlation is represented with a simple first-order Markov model presented in [30]. The channel gain from
every user $i$ to its intended base station, in the logarithmic scale, is therefore modeled as:

\[ \bar{g}_{ii}(n) = \bar{g}_{ii}^{0} + \delta\bar{g}_{ii}(n), \tag{12} \]

\[ \delta\bar{g}_{ii}(n) = a\,\delta\bar{g}_{ii}(n-1) + w_g(n-1), \tag{13} \]

where $\bar{g}_{ii}^{0}$ is a constant bias and $w_g$ is a zero-mean white Gaussian noise sequence. The constant
bias accounts for the antenna gains and the distance loss in the filter. The parameter $a$ is obtained as:

\[ a = e^{-vT/X_s}, \tag{14} \]

where $v$ is the user velocity and $T$ is the update period. Note that $vT$ is the distance that the user moves
during one update period. Moreover, $X_s$ is called the shadowing correlation distance. It is the distance at
which the normalized correlation decreases to $e^{-1}$. To see this, note that the autocorrelation function for
$\delta\bar{g}$ can be obtained as:

\[ R_{\delta g}(m) = E\left[\delta\bar{g}(m+n)\,\delta\bar{g}(n)\right] = \sigma_s^2\, a^{|m|} = \frac{\sigma_{w_g}^2}{1-a^2}\, a^{|m|}, \tag{15} \]

where $\sigma_{w_g}$ denotes the standard deviation of the noise sequence $w_g$. Note that given the standard
deviation for shadowing $\sigma_s$ and the value for $a$, the standard deviation for the driving white noise
sequence can be obtained.
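As a rough check of the first-order Markov model in (13) and its autocorrelation in (15), the following sketch simulates the shadowing deviation and compares sample statistics with the model. The function names, the seed, the sample count, and the choice of lag are illustrative assumptions. The driving noise standard deviation is derived from $\sigma_s$ and $a$ through the stationary-variance relation of a first-order autoregressive process.

```python
import math
import random

def simulate_shadowing(a, sigma_s_db, n_samples, seed=0):
    """Generate samples of the shadowing deviation following (13)."""
    rng = random.Random(seed)
    # Stationary variance relation: sigma_w^2 = (1 - a^2) * sigma_s^2.
    sigma_w = sigma_s_db * math.sqrt(1.0 - a * a)
    delta = rng.gauss(0.0, sigma_s_db)  # start from the stationary distribution
    samples = []
    for _ in range(n_samples):
        samples.append(delta)
        delta = a * delta + rng.gauss(0.0, sigma_w)
    return samples

def sample_autocorr(x, lag):
    """Biased-free sample estimate of E[x(k + lag) x(k)] for a zero-mean sequence."""
    n = len(x) - lag
    return sum(x[k + lag] * x[k] for k in range(n)) / n

a, sigma_s = 0.95, 8.0
x = simulate_shadowing(a, sigma_s, 200_000)
r0 = sample_autocorr(x, 0)    # should be close to sigma_s ** 2
r10 = sample_autocorr(x, 10)  # should be close to sigma_s ** 2 * a ** 10
```

The ratio `r10 / r0` estimates the normalized correlation at a lag of ten update periods, which the model predicts to be `a ** 10`.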

In  order  to  design  distributed  algorithms,  we  need  to  decouple  the  local  loops  in  the  network.  For  this 
purpose,  the  interference  plus  noise  should  be  modeled  independently  for  every  user.  One  approach  is  to  treat 
interference  plus  noise  simply  as  a  bounded  disturbance  for  every  user  and  design  the  power  control  algorithm 
based  on  the  worst  case  considerations.  However,  we  decide  to  model  the  interference  plus  noise,  similar  to 
the  channel  gains,  by  white  noise  driven  first-order  Markov  variations  on  top  of  a  constant  bias.  That  is: 

\[ \bar{I}_i(n) = \bar{I}_i^{0} + \delta\bar{I}_i(n), \tag{16} \]

\[ \delta\bar{I}_i(n) = a\,\delta\bar{I}_i(n-1) + w_I(n-1), \tag{17} \]

where $w_I$ is a zero-mean white Gaussian noise sequence independent of $w_g$, but with the same variance. While
this  model  may  not  exactly  capture  the  slow  variations  in  the  interference  in  a  power-controlled  system,  it 
can  still  be  reasonable  when  such  slow  fluctuations  in  the  interference  levels  are  dominated  by  shadow  fading. 
Note that, putting aside the changes in the transmit power levels due to power control, the fluctuations in the
channel gains and interference levels basically result from the same physical phenomenon. We therefore use
this model in a Kalman filter to obtain the one-step predicted measurements of the local mean interference
values.

Note that one should use receiver diversity techniques to combat fast fading, since power control algorithms,
in  general,  cannot  track  very  fast  channel  variations.  While  we  will  evaluate  the  simulated  performance  of 
our  algorithm  with  higher  power  update  rates,  we  decide  to  select  the  power  update  period  such  that  the  fast 
multipath  fluctuations  are  averaged  out  while  the  slower  shadowing  fluctuations  are  being  tracked.  It  was 
shown  in  [31]  that,  under  the  flat  Rayleigh  fading  assumption,  when  a  first  order  low-pass  filter  or  simply 
a  moving  average  filter  is  used  to  obtain  the  local  mean  values  of  the  measurements,  the  averaging  error  in 
dB  will  have  a  Gaussian  distribution,  whose  mean  can  be  made  zero  by  appropriate  choice  of  the  filter  DC 
gain and whose standard deviation depends on the shadow fading standard deviation $\sigma_s$, the ratio of the
shadowing correlation distance to the carrier wavelength $X_s/\lambda$, and the normalized measurement time
$f_m T$, where $f_m = v/\lambda$ is the maximum Doppler frequency.

It  is  now  clear  that  the  model  parameters  not  only  depend  on  the  environment  through  the  values  of  the 
shadowing  standard  deviation  and  the  shadowing  correlation  distance,  but  also  depend  on  the  user  velocity. 
While  one  can  think  of  implementing  individual  adaptive  Kalman  filters  for  each  user,  where  the  model 
parameters  are  continuously  updated  based  on  the  available  information  about  the  user  velocities,  we  choose 
to  consider  a  fixed  model  to  design  and  implement  the  same  filters  for  all  the  users  in  the  network.  There  are 
two main reasons for this. One is that for a rather broad range of user velocities, the values of $a$ and
$\sigma_{w_g}$ and, as shown in [31], the averaging error variance change only slightly, and we believe that the Kalman filters
will  be  robust  to  such  changes.  The  other  reason  is  that  while  some  techniques  have  been  already  proposed 
for  user  velocity  estimation  in  mobile  environments  (refer  to  [32]  and  the  references  therein),  most  of  them 
fail  to  provide  accurate  estimates  in  real  time. 

B.  Kalman  Filter  Design 

Using  a  set  of  available  measurements,  corrupted  with  Gaussian  noise,  a  Kalman  filter  recursively  obtains 
the  minimum  mean  squared  error  estimates  of  a  set  of  variables  that  are  varying  according  to  a  given  dynamic 
model. Kalman filters have proved to be strong estimation tools in a very wide range of applications [33]. As
examples of applications in communication systems, Kalman filters have been used for channel equalization
[34], interference estimation for call admission in CDMA networks [35], and power control in packet-switched
broadband TDMA networks [9].

We propose a predictive power control algorithm, where two Kalman filters are employed to provide the
one-step predicted estimates of both the channel gains and the interference levels for every user, which are
then used in an integrator algorithm to update the power levels. Using (12) and (13) for the channel gains,
we can write:

\[ \bar{g}_{ii}(n) = a\,\bar{g}_{ii}(n-1) + (1-a)\,\bar{g}_{ii}^{0} + w_g(n-1). \tag{18} \]

Similarly, using (16) and (17) for the interference levels, we can write:

\[ \bar{I}_i(n) = a\,\bar{I}_i(n-1) + (1-a)\,\bar{I}_i^{0} + w_I(n-1). \tag{19} \]


The  idea  is  to  design  two  simple  Kalman  filters  that  use  the  erroneous  local  mean  measurements,  available  to 
every  user,  to  estimate  the  constant  biases  in  the  models  and  provide  the  one-step  predicted  estimates  of  the 
channel gains and the interference levels. As mentioned, the same models are used for all the mobiles in the
network. Hence we drop the indices $i$ and $ii$ for a simpler notation.

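It is now appropriate to represent both models in the state-space form. Define $x_{g1}(n) = \bar{g}(n)$,
$x_{g2}(n) = \bar{g}^{0}$, $x_{I1}(n) = \bar{I}(n)$, and $x_{I2}(n) = \bar{I}^{0}$. The state-space models for
every user can then be obtained as:

\[ \mathbf{x}_g(n) = A_f\,\mathbf{x}_g(n-1) + \mathbf{w}_g(n-1), \tag{20} \]

\[ y_g(n) = H_f\,\mathbf{x}_g(n) + v_g(n), \tag{21} \]

\[ \mathbf{x}_I(n) = A_f\,\mathbf{x}_I(n-1) + \mathbf{w}_I(n-1), \tag{22} \]

\[ y_I(n) = H_f\,\mathbf{x}_I(n) + v_I(n), \tag{23} \]

where:

\[ \mathbf{x}_g \triangleq \begin{bmatrix} x_{g1} \\ x_{g2} \end{bmatrix}, \quad \mathbf{x}_I \triangleq \begin{bmatrix} x_{I1} \\ x_{I2} \end{bmatrix}, \quad \mathbf{w}_g \triangleq \begin{bmatrix} w_g \\ w_{g0} \end{bmatrix}, \quad \mathbf{w}_I \triangleq \begin{bmatrix} w_I \\ w_{I0} \end{bmatrix}, \tag{24} \]

\[ A_f \triangleq \begin{bmatrix} a & 1-a \\ 0 & 1 \end{bmatrix}, \quad H_f \triangleq \begin{bmatrix} 1 & 0 \end{bmatrix}, \tag{25} \]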


where $w_{g0}$ and $w_{I0}$ are two mutually independent fictitious zero-mean white Gaussian noise sequences, which
are also independent of $w_g$ and $w_I$. They are required to make the filters more robust to the uncertainties in
the models. Moreover, $v_g$ and $v_I$ are mutually independent zero-mean white Gaussian noise sequences, which
are assumed to be independent of all other noise sequences in the model and are used to model the
fast-fading-induced averaging errors and other possible uncertainties in the local mean measurements. Remember
that all the variables are expressed in a logarithmic scale.

Now, starting from initial estimates $\hat{\mathbf{x}}_g(0)^-$ and $\hat{\mathbf{x}}_I(0)^-$, the measurement
update equations for the filters are expressed as:

\[ \hat{\mathbf{x}}_g(n)^+ = \hat{\mathbf{x}}_g(n)^- + L_g(n)\left(y_g(n) - H_f\,\hat{\mathbf{x}}_g(n)^-\right), \tag{26} \]

\[ \hat{\mathbf{x}}_I(n)^+ = \hat{\mathbf{x}}_I(n)^- + L_I(n)\left(y_I(n) - H_f\,\hat{\mathbf{x}}_I(n)^-\right), \tag{27} \]

where $\hat{\mathbf{x}}_g(n)^-$ and $\hat{\mathbf{x}}_I(n)^-$ respectively denote the propagated (a priori)
estimates of the channel gain and the interference level at the end of the $(n-1)$-th power update period.
Hence, at time $n$ (i.e., the beginning of the $n$-th power update period), the current local mean measurements
$y_g(n)$ and $y_I(n)$ are incorporated to obtain the updated (a posteriori) estimates
$\hat{\mathbf{x}}_g(n)^+$ and $\hat{\mathbf{x}}_I(n)^+$. The two-dimensional filter gain vectors $L_g$ and $L_I$
are obtained as:

\[ L_g(n) = P_g(n)^- H_f^T \left(H_f P_g(n)^- H_f^T + V_g\right)^{-1}, \tag{28} \]

\[ L_I(n) = P_I(n)^- H_f^T \left(H_f P_I(n)^- H_f^T + V_I\right)^{-1}, \tag{29} \]

where $V_g$ and $V_I$ are the measurement noise covariances and $P_g(n)^-$ and $P_I(n)^-$ are the propagated
estimation error covariance matrices. Note that we only have scalar measurements and no matrix inversion is
involved. At time $n$, the covariance matrices are updated as:

\[ P_g(n)^+ = P_g(n)^- - L_g(n) H_f P_g(n)^-, \tag{30} \]

\[ P_I(n)^+ = P_I(n)^- - L_I(n) H_f P_I(n)^-. \tag{31} \]

Now the one-step predicted estimates for the channel gain and the interference level are obtained by
propagating the estimates to the next power update period:

\[ \hat{\mathbf{x}}_g(n+1)^- = A_f\,\hat{\mathbf{x}}_g(n)^+, \tag{32} \]

\[ \hat{\mathbf{x}}_I(n+1)^- = A_f\,\hat{\mathbf{x}}_I(n)^+, \tag{33} \]

and the covariance matrices are propagated as:

\[ P_g(n+1)^- = A_f P_g(n)^+ A_f^T + W_g, \tag{34} \]

\[ P_I(n+1)^- = A_f P_I(n)^+ A_f^T + W_I, \tag{35} \]

where $W_g$ and $W_I$ are two-dimensional diagonal covariance matrices for the driving noise sequences in (20)
and (22), respectively.

Incorporating the one-step predicted estimates in the integrator algorithm (5), the updated power level for
the duration of the $n$-th power update period can be obtained as:

\[ \bar{p}(n) = \bar{p}(n-1) + \left(\bar{\gamma}^{d} - \hat{\bar{\gamma}}(n+1)^-\right), \tag{36} \]

where:

\[ \hat{\bar{\gamma}}(n+1)^- = \bar{p}(n-1) + \hat{x}_{g1}(n+1)^- - \hat{x}_{I1}(n+1)^- = \bar{p}(n-1) + \hat{\bar{g}}(n+1)^- - \hat{\bar{I}}(n+1)^-. \tag{37} \]
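The filter recursion in (26)-(35) and the power update in (36)-(37) can be sketched together as follows. This is a minimal illustration, not the authors' implementation: the function names, the plain-list representation of the 2x2 matrices, the constant measurements, and the deliberately offset initial estimates are all assumptions made for the example.

```python
def kalman_cycle(x_prior, P_prior, y, a, W, V):
    """One measurement update and propagation, following (26)-(35).

    x_prior: a priori state [level, bias]; P_prior: 2x2 a priori covariance;
    y: scalar local mean measurement; W: (w11, w22) diagonal of the driving
    noise covariance; V: measurement noise variance.
    Returns the next a priori state, the next a priori covariance, and the gain.
    """
    # Gain (28)-(29): with H_f = [1 0], H_f P H_f^T is simply P[0][0].
    s = P_prior[0][0] + V
    L = [P_prior[0][0] / s, P_prior[1][0] / s]
    # Measurement update (26)-(27).
    innovation = y - x_prior[0]
    x_post = [x_prior[0] + L[0] * innovation, x_prior[1] + L[1] * innovation]
    # Covariance update (30)-(31): H_f P- is row 0 of P-.
    P_post = [[P_prior[r][c] - L[r] * P_prior[0][c] for c in range(2)]
              for r in range(2)]
    # Propagation (32)-(35) with A_f = [[a, 1 - a], [0, 1]].
    A = [[a, 1.0 - a], [0.0, 1.0]]
    x_next = [A[0][0] * x_post[0] + A[0][1] * x_post[1], x_post[1]]
    AP = [[sum(A[r][k] * P_post[k][c] for k in range(2)) for c in range(2)]
          for r in range(2)]
    P_next = [[sum(AP[r][k] * A[c][k] for k in range(2)) for c in range(2)]
              for r in range(2)]
    P_next[0][0] += W[0]
    P_next[1][1] += W[1]
    return x_next, P_next, L

def predictive_power_update(p_prev_dbm, g_pred_db, i_pred_dbm, target_db):
    """Power update (36) using the predicted SIR (37)."""
    sir_pred_db = p_prev_dbm + g_pred_db - i_pred_dbm
    return p_prev_dbm + (target_db - sir_pred_db)

# Hypothetical run: constant measurements, offset initial estimates.
a, W, V = 0.95, (2.0, 2.0), 9.0
xg, Pg = [-90.0, -90.0], [[64.0, 0.0], [0.0, 64.0]]
xi, Pi = [-100.0, -100.0], [[64.0, 0.0], [0.0, 64.0]]
power_dbm = 0.0
for _ in range(600):
    xg, Pg, _ = kalman_cycle(xg, Pg, -100.0, a, W, V)
    xi, Pi, _ = kalman_cycle(xi, Pi, -110.0, a, W, V)
    power_dbm = predictive_power_update(power_dbm, xg[0], xi[0], 12.0)
```

Note that substituting (37) into (36) cancels the previous power level, so the new power is simply the target minus the predicted gain plus the predicted interference. In this example the predictions settle at the constant measurements and the power settles at 12 + 100 - 110 = 2 dBm.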

When a call is assigned (or reassigned) to an idle channel, its Kalman filter estimates are initialized (or reset)
as $\hat{x}_{g1}(0)^- = \hat{x}_{g2}(0)^- = \bar{g}(0)$ and $\hat{x}_{I1}(0)^- = \hat{x}_{I2}(0)^- = \bar{I}(0)$,
where $\bar{g}(0)$ and $\bar{I}(0)$ are the local mean channel gain and interference values available at the time
of channel assignment. Also, the error covariance matrices are initialized as
$P_g(0)^- = P_I(0)^- = \mathrm{diag}(\sigma_s^2, \sigma_s^2)$, where $\sigma_s$ is the shadow fading standard
deviation (set to 8 dB in our simulations).

We  pick  the  model  parameter  a  according  to  (14)  and  by  considering  the  maximum  user  velocities  that  we 
expect  in  our  mobile  environment.  This  makes  the  filter  assume  the  least  correlation  among  the  local  mean 
values  in  two  consecutive  power  update  periods  and  therefore  rely  more  on  the  measurements.  As  we  shall 
explain  in  our  simulation  details,  we  assume  the  power  levels  to  be  updated  every  100  msec.  Also  we  consider 
the  shadowing  correlation  distance  to  be  about  40m  and  the  maximum  user  velocity  to  be  80  km/hr.  Using 
(14), we then pick $a = 0.95$. Using this value for $a$ and $\sigma_s = 8$ dB and (15), we get
$\sigma_{w_g}^2 = \sigma_{w_I}^2 = 1.56$. We choose to set $\sigma_{w_g}^2 = \sigma_{w_I}^2 = 2.0$ in the filter,
again to deal with uncertainties in the models. The variances for the fictitious driving noise sequences
$w_{g0}$ and $w_{I0}$ are also set to 2.0 dB$^2$. Also, the standard deviations for the local mean measurement
errors are both set to 3.0 dB, i.e., $V_g = V_I = 9.0$.
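The choice of $a$ can be reproduced with a short calculation. The sketch below assumes that (14) has the exponential form $a = e^{-vT/X_s}$; the function name and argument names are illustrative.

```python
import math

def shadowing_coefficient(speed_kmh, update_period_s, corr_distance_m):
    """Correlation coefficient between consecutive update periods, assuming
    the exponential form a = exp(-v T / X_s)."""
    speed_ms = speed_kmh / 3.6  # km/h to m/s
    return math.exp(-speed_ms * update_period_s / corr_distance_m)

# Values quoted in the text: 80 km/h, 100 msec update period, 40 m correlation distance.
a = shadowing_coefficient(80.0, 0.1, 40.0)
a_rounded = round(a, 2)
```

At 80 km/h the user moves about 2.2 m per update period, which gives a value near 0.946 that rounds to the 0.95 used in the filter. Lower speeds give values closer to one.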

One  should  observe  that  the  error  covariance  matrices  and  the  filter  gains  are  independent  of  the  actual 
measurements. This can be seen from the filter equations (28)-(35). Therefore, the filter gains $L_g$ and $L_I$ can,
in  fact,  be  calculated  and  saved  a  priori.  This  can  result  in  a  significant  reduction  in  the  filter  processing  time. 

Also note that when the filter reaches the steady state on a specific channel, the steady-state filter gain
vectors are equal to:

\[ L_g = L_I = P H_f^T \left(H_f P H_f^T + V\right)^{-1}, \tag{38} \]

where $V_g = V_I = V$ and $P$ is the positive-definite solution to the following discrete Riccati equation:

\[ P = A_f P A_f^T - A_f P H_f^T \left(H_f P H_f^T + V\right)^{-1} H_f P A_f^T + W, \tag{39} \]

where $W_g = W_I = W$. Using our selected values, we get:

\[ L_g = L_I = L = \begin{bmatrix} 0.37990 & 0.37121 \end{bmatrix}^T. \tag{40} \]
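Because the gains do not depend on the measurements, the whole gain sequence can be tabulated offline by running only the covariance recursion (28), (30), and (34). The sketch below does this for the two-state model and iterates long enough to approach the steady state. The function name, the scalar bookkeeping of the symmetric covariance entries, and the number of steps are assumptions made for illustration.

```python
def gain_schedule(a, W, V, P0, n_steps):
    """Precompute gain vectors from (28), (30), (34) with H_f = [1 0].

    W: (w11, w22) diagonal of the driving noise covariance; V: measurement
    noise variance; P0: (p11, p12, p22) entries of the initial a priori covariance.
    """
    p11, p12, p22 = P0
    b = 1.0 - a
    gains = []
    for _ in range(n_steps):
        # Gain (28).
        s = p11 + V
        l1, l2 = p11 / s, p12 / s
        gains.append((l1, l2))
        # A posteriori covariance (30).
        q11 = p11 - l1 * p11
        q12 = p12 - l1 * p12
        q22 = p22 - l2 * p12
        # Propagation (34) with A_f = [[a, 1 - a], [0, 1]].
        p11 = a * a * q11 + 2.0 * a * b * q12 + b * b * q22 + W[0]
        p12 = a * q12 + b * q22
        p22 = q22 + W[1]
    return gains

# Values selected in the text: a = 0.95, W = diag(2, 2), V = 9, P(0) = diag(64, 64).
gains = gain_schedule(0.95, (2.0, 2.0), 9.0, (64.0, 0.0, 64.0), 2000)
steady_gain = gains[-1]
```

With these values the last entry of the table should agree with the steady-state gain vector reported in (40) to about three decimal places, and a stored table of this kind is what would replace the online covariance computation.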

C.  Global  Stability  of  the  Network 

When  the  Kalman  filters  are  employed,  the  block  diagram  for  a  single  loop  can  be  depicted  as  in  Figure  2. 
We now show that, in the steady-state, the Kalman filters and therefore the local loops are stable (see footnote 2). Moreover,
the  sufficient  conditions  for  global  stability  are  satisfied. 

Given  the  filter  gains  in  (40),  it  is  straightforward  to  obtain  the  steady-state  transfer  functions  for  the 
Kalman  filters: 

\[ \frac{\hat{\bar{g}}(n+1)^-}{\bar{g}(n)} = \frac{\hat{\bar{I}}(n+1)^-}{\bar{I}(n)} = \frac{q\,(0.37947\,q - 0.36091)}{q^2 - 1.57053\,q + 0.58909}. \tag{41} \]

The poles of the Kalman filters (i.e., the poles of the above transfer function or equivalently the eigenvalues
of $A_f - A_f L H_f$) are located inside the unit circle at:

\[ s_{f1} = 0.61928, \quad s_{f2} = 0.95125. \tag{42} \]


It is now clear that all the local loops are stable, i.e., the poles for all the closed-loop transfer functions,
associated with a single loop, are inside the unit circle. Processing and propagation delays (i.e., extra delay
blocks in the feedback path) could result in instability of the local loops and therefore instability of the
whole network. However, even though some delay compensation schemes have been proposed in [12],
information-feedback power control algorithms, as mentioned before, usually run at lower power update rates,
and processing and propagation delays are usually much shorter than a power update period.

As  we  mentioned,  stability  of  the  local  loops  is  necessary  but  not  sufficient  for  global  stability  of  the  network. 
However, the network will indeed be globally stable in the $\ell_\infty$-induced norm sense, if the transfer
function from the interference $\bar{I}(n)$ to the power $\bar{p}(n-1)$ satisfies the norm condition (9).

Using (41) and Figure 2, it is straightforward to obtain:

\[ G(q) = \frac{\bar{p}(n-1)}{\bar{I}(n)} = \frac{0.37947\,q - 0.36091}{q^2 - 1.57053\,q + 0.58909}, \tag{43} \]

and hence we get:

\[ \|G(q)\|_{\ell_2\text{-induced}} = \|G(q)\|_{\ell_\infty\text{-induced}} = 1.0. \tag{44} \]

Footnote 2: Under the technical conditions of stabilizability and detectability, the steady-state Kalman
filters are always known to be stable [33].

Therefore $G(q)$ satisfies both (9) and (11). From (9), we conclude that, as long as the network is in its feasible
region, the deviations of the power levels of all the users in the network from their corresponding optimal values
will always remain bounded. Moreover, from (11), we conclude that if the power levels only slightly deviate
from their optimal values, while the channel gains remain constant, they will asymptotically converge back to
their optimal values. This proves the global stability of the network, on every channel, both in the
$\ell_\infty$ sense and in the $\ell_2$ sense (with a linearized interference function), when the Kalman
filters are at their steady state.
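The pole locations in (42) and the norm values in (44) can be checked numerically from the coefficients of (43). The sketch below is an illustration only: the names are hypothetical, the $\ell_\infty$-induced norm is approximated by truncating the sum in (10), and the $\ell_2$-induced norm is approximated by sampling the frequency response on a grid.

```python
import cmath
import math

B1, B0 = 0.37947, -0.36091   # numerator of (43): B1*q + B0
A1, A0 = -1.57053, 0.58909   # denominator of (43): q^2 + A1*q + A0

def impulse_response(n_terms):
    """Impulse response g(k) of G(q), from its difference equation."""
    g = []
    for k in range(n_terms):
        value = (B1 if k == 1 else 0.0) + (B0 if k == 2 else 0.0)
        if k >= 1:
            value -= A1 * g[k - 1]
        if k >= 2:
            value -= A0 * g[k - 2]
        g.append(value)
    return g

def freq_response(omega):
    """G evaluated on the unit circle at q = exp(j*omega)."""
    q = cmath.exp(1j * omega)
    return (B1 * q + B0) / (q * q + A1 * q + A0)

# Poles: roots of the denominator polynomial.
disc = math.sqrt(A1 * A1 - 4.0 * A0)
poles = sorted([(-A1 - disc) / 2.0, (-A1 + disc) / 2.0])

# Truncated l1 norm of the impulse response, as in (10).
l1_norm = sum(abs(v) for v in impulse_response(5000))

# Peak magnitude of the frequency response on a grid over [0, pi].
hinf_norm = max(abs(freq_response(math.pi * k / 2000)) for k in range(2001))
```

For these coefficients the impulse response stays nonnegative, so both quantities coincide with the DC gain of $G(q)$, which is why the two norms in (44) take the same value.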

When  multiple  channels  are  considered  and  the  power  control  algorithm  is  integrated  with  a  DCA  scheme, 
the global stability analysis for the network becomes extremely complicated. The average number of channel
reassignments per call can be considered as a measure that indicates the level of stability of the
network. We show through computer simulations that the average number of channel reassignments per call
will be significantly reduced when the Kalman filters are employed in the power control algorithm.

IV.  Simulation  Model 

While the previous theoretical analysis helps in justifying the use of Kalman filters in power control
algorithms to deal better with the variations in the channel gains and the interference levels and also the errors
in the local mean measurements, a simulation study is essential to analyze the overall performance when such
a predictive power control algorithm is integrated with a DCA scheme in a relatively realistic mobile
environment. We therefore set up a system-level simulation environment, similar to the ones presented in [17]-[18]
but on a smaller scale, in order to analyze the overall performance of the network, when our predictive power
control algorithm is integrated with a distributed minimum interference DCA scheme. User arrivals and
departures and user mobility are all considered. In this section, we explain the details of our simulation platform
and in the next section, we analyze the results.

The  simulations  run  on  the  frame  level,  and  hence  only  power  and  interference  levels  are  simulated  and  no 
modulation  and  coding  are  considered  in  the  simulations.  While  we  do  not  restrict  ourselves  to  any  specific 
standard,  we  have  tried  to  stay  close  to  the  Global  System  for  Mobile  Communications  (GSM)  standard. 

A  3x3  square  grid  of  cells  is  assumed.  The  base  stations  are  located  on  the  cell  centers  and  are  separated  by 
800m.  To  avoid  edge  effects,  a  ring  simulation  structure  is  assumed,  i.e.,  the  statistics  are  only  gathered  from 
the  central  cell.  This  is  somewhat  simpler  than  a  toroidal  simulation  structure  and  is  shown  to  provide  more 
optimistic  but  comparable  results  [36].  The  other  reason  for  our  results  to  be  somewhat  optimistic  is  that  only 
nine cells are simulated, and therefore lower interference levels are generated. However, our simulation results
clearly serve our purpose of comparing our predictive DCPA scheme with the one that uses no prediction.
Omni-directional antennas with two-branch selection diversity are assumed for the base stations.

Every channel is characterized by a pair $(m, n)$, where $m$ denotes the carrier frequency and $n$ is the index for
the time slot. We consider two carrier frequencies and eight slots per carrier. As mentioned before, no blind
slots are considered. Hence, there are 16 available channels, all of which can potentially be used as traffic
channels.

Every  frame  is  4.0  msec,  consisting  of  8  slots,  each  with  a  duration  of  0.5  msec.  It  is  assumed  that  the 
signal  and  interference  power  measurements  for  every  user  are  available  in  every  frame  at  the  end  of  the  user’s 
corresponding  slot.  Various  events  might  then  happen  every  multiple  number  of  frames. 

The  channel  gain  for  every  link  is  normalized  with  respect  to  the  base  station  and  mobile  antenna  gains  and 
is  characterized  by  three  components:  distance  loss,  slow  or  shadow  fading  and  fast  fading.  The  distance  loss 
is assumed to be inversely proportional to $d^{\alpha}$, where $\alpha$ is set to 4.0. For shadowing, a log-normal pattern is
generated a priori. Therefore the shadowing values only depend on the user's location. The resolution of the
shadowing grid is set to be equal to the shadowing correlation distance $X_s$, which is assumed to be 40m. The
shadowing  for  every  user  is  then  obtained  by  a  normalized  bilinear  interpolation  of  the  four  closest  points  on 
the  shadowing  grid.  A  slowly  varying  flat  Rayleigh  fading  is  also  assumed.  This  implies  that  no  line-of-sight 
exists  and  the  delay  spread  is  small  compared  to  the  symbol  duration  or  the  inverse  channel  bandwidth  and 
thus  only  a  single  path  with  a  Rayleigh  distributed  amplitude  (and  hence  exponentially  distributed  power)  can 
be  distinguished.  In  fact,  the  Rayleigh  fading  component  is  assumed  to  be  constant  for  the  whole  duration  of 
a single slot (0.5 msec). Time correlation for Rayleigh fading is often represented using Jakes' model [29],
where it is expressed in terms of a zero-order Bessel function of the first kind, which results in a non-rational
spectrum. We use a first-order approximation by passing a white complex Gaussian noise through a first-order
filter and obtaining the squared magnitude of the output Gaussian process. The time constant of the filter, for
every user, is obtained by setting its 3 dB cut-off frequency equal to $f_m/4$, where $f_m = v/\lambda$ is the maximum
Doppler frequency for the user [13].

New calls are generated based on a Poisson process with a given arrival rate $\lambda_a$. Each call is assigned an
exponentially distributed holding time with a given average value $T_h$. The average Erlang load per cell is then
obtained as $E_c = \lambda_a T_h / N_c$, where $N_c = 9$ is the total number of cells. The Erlang load per cell effectively
determines the average number of users that could be active in every cell at any instant of time. We have
considered various combinations of values for $\lambda_a$ and $T_h$ to simulate the network under different traffic load
conditions.

The new users are uniformly distributed in the area. The mobility of user $i$ is modeled with a constant
but random speed $v_i$ and the angle $\theta_i$ between the velocity vector and the horizontal axis
($-\pi < \theta_i < \pi$). The speed for every new user is selected randomly from a triangular distribution in the
range 0-80 km/h. This is preferred over a uniform distribution, as it results in a smaller variance for the
velocity distribution among different users. The initial direction $\theta_i$ is uniformly picked. Then every
10 sec, a new direction is selected from a triangular distribution with the old direction as its mean. This is
again preferred over a uniform distribution or a two-dimensional random walk, since it makes small-angle turns
more probable than large ones. The motion trajectory for a sample user is shown in Figure 3.

The desired SIR threshold for all users in the network is set to $\bar{\gamma}^{d} = 12$ dB, while the minimum tolerable
SIR is considered to be $\bar{\gamma}^{min} = 10$ dB. Both margins for new user admissions and user droppings are set to
2 dB. Therefore new users will be admitted only if they can achieve $\bar{\gamma}^{new} = 14$ dB on the idle channel with
the minimum local mean interference. Moreover, a user will be dropped from the network if its SIR drops
below $\bar{\gamma}^{drop} = 8$ dB and stays below for 4.0 consecutive seconds. Note that these margins should have been
expressed as percentages of $\bar{\gamma}^{d}$ and $\bar{\gamma}^{min}$ for every user, if the users were to have different quality of service
requirements and thus different SIR thresholds.
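The thresholds and the dropping rule can be written out as a small sketch. The constant and function names are hypothetical, and the sketch assumes that the SIR is checked once per power update period of 100 msec, which the text does not state explicitly for the dropping test.

```python
SIR_TARGET_DB = 12.0
SIR_MIN_DB = 10.0
MARGIN_DB = 2.0

SIR_NEW_DB = SIR_TARGET_DB + MARGIN_DB   # admission threshold, 14 dB
SIR_DROP_DB = SIR_MIN_DB - MARGIN_DB     # dropping threshold, 8 dB

def should_drop(sir_history_db, sample_period_s=0.1, hold_s=4.0):
    """True if the most recent SIR samples stayed below the dropping
    threshold for hold_s consecutive seconds."""
    needed = int(round(hold_s / sample_period_s))
    if len(sir_history_db) < needed:
        return False
    return all(s < SIR_DROP_DB for s in sir_history_db[-needed:])

# Hypothetical SIR traces sampled every 100 msec.
dropped = should_drop([7.0] * 40)              # 4.0 s below threshold
kept_short = should_drop([7.0] * 39)           # only 3.9 s of history
kept_recovered = should_drop([7.0] * 39 + [9.0])
```

A single sample at or above the dropping threshold restarts the count, since the rule requires consecutive time below the threshold.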

When a new user arrives into the network, it first starts scanning the downlink control channel from all
neighboring base stations and measures all the local mean channel gains. It is assumed that this process takes
about 0.8 sec (200 frames), which is called the initial call set-up time. The new user then sends its request
for a channel to the base station which has the strongest signal. If this base station does not have any idle
channels, the user will try the second best base station. This procedure is called Direct Retry and will be
repeated for a given number of base stations (set to 3 in our simulations) before the user is blocked. When
there are idle channels available, the base station checks whether the user can achieve $\bar{\gamma}^{new}$ on the idle channel
with the minimum local mean interference. If so, the user will be admitted and will be assigned to the idle
channel with the minimum interference. Otherwise, the user will be blocked.
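The admission procedure with Direct Retry can be sketched as follows. This is an illustration under assumptions that are not in the text: the dictionary layout for base stations, the function name, and the test of achievability at a maximum transmit power of 30 dBm are all hypothetical choices.

```python
def admit_new_user(stations, p_max_dbm=30.0, sir_new_db=14.0, max_tries=3):
    """Return (station id, channel) if the user is admitted, else None.

    stations: list of dicts with keys 'id', 'gain_db' (local mean channel
    gain to that base station), and 'idle' (mapping from idle channel to
    its local mean interference in dBm).
    """
    # Try base stations in order of decreasing local mean channel gain.
    ranked = sorted(stations, key=lambda s: s["gain_db"], reverse=True)
    for station in ranked[:max_tries]:
        idle = station["idle"]
        if not idle:
            continue  # Direct Retry: no idle channel, try the next best station
        channel = min(idle, key=idle.get)  # idle channel with minimum interference
        best_sir_db = p_max_dbm + station["gain_db"] - idle[channel]
        if best_sir_db >= sir_new_db:
            return station["id"], channel
        return None  # blocked: admission threshold cannot be reached
    return None  # blocked: no idle channel at any of the tried stations

# Hypothetical example: the strongest station is full, the second has two idle channels.
stations = [
    {"id": "A", "gain_db": -90.0, "idle": {}},
    {"id": "B", "gain_db": -95.0, "idle": {(0, 1): -110.0, (1, 3): -118.0}},
]
decision = admit_new_user(stations)
```

In this example the request falls through to station B and the user is assigned the idle channel with the lower interference level.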

We  should  note  that  no  macro  diversity  is  considered,  i.e.,  any  user  will  only  communicate  with  a  single 
base  station  at  any  instant  of  time.  Moreover,  base  station  assignment  is  considered  to  be  separate  from  power 
control, i.e., the power levels are obtained assuming that the users are already assigned to their corresponding
base stations. Joint base station assignment and power control has already been proposed in the literature.

A minimum interference DCA scheme is employed. The local mean channel gain and interference values
for possible channel reassignments are obtained by simple averaging of the available measurements over 50
consecutive frames for every user.

Finally,  a  base  station  hand-off  attempt  will  be  triggered  if  the  local  mean  channel  gain  from  a  neighboring 
base  station  exceeds  the  corresponding  value  from  the  current  base  station  by  a  selected  hand-off  margin  of 
4  dB.  If  the  hand-off  attempt  fails,  the  user  will  stay  with  its  current  base  station.  Note  that  the  users  are 
assumed  to  be  continuously  monitoring  the  downlink  control  channels  of  all  neighboring  base  stations. 

Two  power  control  algorithms  are  simulated.  Namely,  the  simple  integrator  algorithm  in  (5)  and  (6)  and 
the  predictive  algorithm  in  (36)  and  (37)  are  compared.  Note  that  while  the  propagation  simulation  models 
are  tailored  to  individual  users,  according  to  their  different  trajectories  and  speeds,  the  same  Kalman  filter 
models  and  parameters  are  employed  for  all  the  users  in  the  network. 

After a new user $i$ is admitted, it sets its initial power at:

\[ p_i(0) = \frac{\gamma^{d}\, I_i(0)}{g_{ii}(0)}, \tag{45} \]

where $g_{ii}(0)$ and $I_i(0)$, respectively, denote the local mean channel gain and the interference plus noise
level, which are available at the time of user admission. Note that this is a somewhat optimistic choice, since
a new user sets its initial power as though other users will not increase their transmit powers.

For most of the simulations, the power update period is assumed to be the same for all users and is set to
100 msec, that is, every user updates its power level every 25 frames. The idea is to have fast multipath
fluctuations averaged out while slower variations are being tracked. In all simulations, a maximum transmit
power constraint of 30 dBm is imposed on all users in the network, while the receiver noise floor is set to -120
dBm.
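
As a small worked sketch of these settings, the initial power rule can be written in dB units, where the product and ratio become a sum and a difference. The function names are hypothetical, the 4 msec frame duration is inferred from "100 msec = 25 frames", and applying the 30 dBm cap already at admission is an assumption.

```python
MAX_POWER_DBM = 30.0      # maximum transmit power from the text
FRAME_MS = 4.0            # inferred: 100 msec corresponds to 25 frames
UPDATE_PERIOD_MS = 100.0  # power update period used in most simulations


def initial_power_dbm(target_sir_db, interference_dbm, gain_db,
                      max_dbm=MAX_POWER_DBM):
    """Initial power in dBm: target SIR + interference - channel gain,
    clipped at the maximum transmit power (clipping here is an assumption)."""
    unconstrained = target_sir_db + interference_dbm - gain_db
    return min(unconstrained, max_dbm)


def frames_per_update(period_ms=UPDATE_PERIOD_MS, frame_ms=FRAME_MS):
    """Number of frames between two consecutive power updates."""
    return round(period_ms / frame_ms)
```

For example, with a 12 dB target, -110 dBm of interference plus noise and a -100 dB channel gain, the user would start at 2 dBm.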


It should be mentioned that since the users arrive at arbitrary instants of time according to a Poisson arrival
process, the power updates are in fact performed asynchronously, even though all the users have the same
power update rates. While most results in power control assume synchronous power updates among the users,
asynchronous power control algorithms have been addressed in the literature [5]. To have synchronous power
updates, one could simply force the users to arrive at instants of time that are multiples of a common power
update period.

V.  Performance Analysis

In  this  section  we  present  and  analyze  our  simulation  results  and  show  how  the  predictive  DCPA  scheme 
can  improve  the  overall  performance  of  the  network. 

For any given traffic load, we run the simulations multiple times with different random generator seeds,
and every run continues until a sufficient number of calls have been dropped. The statistics are then gathered
from the central cell.

Figures 4 and 5 show the call blocking and the call dropping responses of the network under the two DCPA
schemes. It can be seen that at 7.0 Erlang/Cell, the predictive DCPA scheme achieves about 0.5% lower
blocking rate and about 0.03% lower dropping rate. Moreover, the improvement in performance increases as
the traffic load increases. Recall that there is always a trade-off between blocking new calls and dropping
active calls.

Our predictive DCPA scheme also results in better target SIR tracking. We estimate the SIR error standard
deviation and the SIR cumulative distribution functions from the local mean SIR values of all the users in
the network, sampled at various random instants of time during every run of the simulation (after enough call
attempts have been made for the network to reach an approximate steady state). Figure 6 shows the standard
deviation of the error in the local mean SIR for a range of traffic loads. It can be seen that the predictive
scheme decreases the SIR error standard deviation by about 0.3 dB at 7.0 Erlang/Cell, while the improvement
is about 0.7 dB at 10.0 Erlang/Cell. Furthermore, Figure 7 shows the cumulative distribution of the local mean
SIR values in the network under 8.0 Erlang/Cell and 10.0 Erlang/Cell. These figures show how the local mean
SIR values for different users are spread around the target SIR value $\gamma^{*} = 12$ dB. It can be seen that the
predictive DCPA scheme results in local mean SIR values that are less spread around the target SIR. The
improvement is again more noticeable at higher traffic loads. In fact, Figure 8 shows how the local mean SIR
cumulative distribution function changes with the traffic load under both schemes.

One measure that gives some indication of the level of stability of the network is the average number of
channel reassignments per call. Figure 9 shows this number for a range of traffic loads under both DCPA schemes.
As one would expect, fewer channel reassignments per call are, on average, required in the predictive DCPA
scheme. One reason for this is that, as shown before, the predictive scheme does indeed result in better target
SIR tracking and smoother local mean SIR behavior.

We also compare the transmit power distribution of the users in the network under the two DCPA schemes.
Figure 10 shows an estimate of the cumulative distribution function for the transmit powers of the users in the
network at the load of 8.0 Erlang/Cell. It can be seen that the two schemes perform quite similarly, as far as
transmit  powers  are  concerned.  In  fact,  both  algorithms  result  in  considerable  power  saving,  when  compared 
with  a  network  where  all  the  power  levels  are  fixed  at  their  maximum  levels.  For  example,  at  a  relatively  high 
load  of  8.0  Erlang/Cell,  about  50%  of  the  users  under  both  DCPA  schemes  are  transmitting  at  0  dBm  or  lower 
power  levels.  It  should  however  be  mentioned  that  our  predictive  DCPA  algorithm  seems  to  result  in  slightly 
higher  power  levels  in  the  network.  While  one  may  see  this  as  a  small  cost  for  better  SIR  tracking  and  better 
call  blocking  and  dropping  responses,  it  should  also  be  noted  that  our  predictive  DCPA  scheme  does  indeed 
result  in  higher  capacity  which  in  turn  implies  more  active  users  at  any  instant  in  time.  This  higher  traffic 
explains  the  higher  average  transmit  power  for  the  users.  In  fact,  Figure  11  shows  how  the  power  cumulative 
distribution  functions  might  change  as  the  traffic  load  on  the  network  changes  under  the  two  DCPA  schemes. 

Finally, one might argue that our power update rate is too low for the average speeds considered in our
simulations. In order to further evaluate the performance of our predictive algorithm, as compared to standard
fast power control schemes, we also simulated the DCPA scheme with a standard fixed-step power control
algorithm where, depending on the deviation of the received SIR from its target value, the power of each user
is incremented or decremented by a fixed 1 dB step every single frame (i.e., once per 4 msec). We then ran the
same simulations with our integrated predictive DCPA scheme where the power of each user is updated every
5th frame (i.e., once every 20 msec). Tables 1 and 2 show the call dropping and call blocking probabilities
for the two scenarios under two sample traffic loads. It can be seen that the results are similar, with our
predictive algorithm still performing slightly better. Note however that while some additional computational
cost is associated with our algorithm, the update rate for our algorithm is taken to be five times slower than
that of the standard fixed-step algorithm. We do however believe that clarifying the exact trade-off between the
extra computation and the lower update rate would require further analysis using simulations and, possibly,
profiling the code on specific processors.
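
To make the difference between the two update rules concrete, the following sketch contrasts a fixed-step update with a plain integrator update in dB units. It is a simplified illustration only: the Kalman prediction of the predictive scheme is not modeled, the function names are hypothetical, and treating an SIR exactly on target as a "decrement" case is an assumption.

```python
def fixed_step_update(power_dbm, sir_db, target_sir_db,
                      step_db=1.0, max_dbm=30.0):
    """Fixed-step rule: move the power by one step toward the target SIR."""
    if sir_db < target_sir_db:
        power_dbm += step_db
    else:
        # Assumption: an SIR at or above the target triggers a decrement.
        power_dbm -= step_db
    return min(power_dbm, max_dbm)


def integrator_update(power_dbm, sir_db, target_sir_db,
                      gain=1.0, max_dbm=30.0):
    """Integrator rule: correct the power by the full (scaled) SIR error."""
    return min(power_dbm + gain * (target_sir_db - sir_db), max_dbm)
```

With a constant channel gain and interference level, the fixed-step rule needs several frames to cover a large SIR error and then oscillates by one step around the target, whereas the unity-gain integrator removes the error in a single update. This is one way to see why the fixed-step rule is usually run at a much higher update rate.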

VI.  Conclusion 

A  predictive  Dynamic  Channel  and  Power  Allocation  scheme  was  presented  in  this  paper.  Simple  Kalman 
filters  were  designed  to  obtain  the  predicted  estimates  of  the  local  mean  channel  gains  and  the  local  mean 
interference  plus  noise  levels.  These  predicted  estimates  were  then  incorporated  in  an  integrator  algorithm  to 
update the power levels of all the users in the network. It was shown how generic models may be used and
filter parameters may be selected to design the same robust filter for all users. Local stability of the network
was  analyzed.  Moreover  it  was  shown  that  the  sufficient  conditions  for  global  stability  of  the  network  were 
satisfied  when  the  Kalman  filters  were  employed  in  the  power  control  algorithm.  The  global  stability  results 
imply  that,  as  long  as  the  network  stays  feasible,  the  deviations  of  the  power  levels  from  their  corresponding 
optimal  values  will  always  remain  bounded,  while  the  small  deviations  will  always  converge  back  to  zero. 

This  predictive  power  control  algorithm  was  integrated  with  a  Minimum  Interference  Dynamic  Channel 
Assignment  scheme  in  an  FDMA/TDMA  mobile  radio  system.  A  system-level  simulation  environment  was 
then developed. User arrivals and departures and user mobility, along with flat Rayleigh fading effects, were all
included  in  the  simulations.  It  was  shown  that  the  predictive  DCPA  scheme  results  in  better  call  dropping 
and  call  blocking  responses  and  also  better  target  SIR  tracking  performance  for  the  network.  Moreover,  on 
average,  fewer  channel  reassignments  per  call  are  required  under  the  predictive  DCPA  scheme.  We  believe 
that  these  improvements  are  obtained  mainly  because  the  predictive  algorithm  takes  into  account  at  least  the 
slow  variations  of  the  channel  gains.  Also  by  dealing  with  uncertainties  in  the  measurements,  it  effectively 
mitigates  the  fading  induced  local  mean  measurement  errors.  It  was  shown  however  that  the  predictive  DCPA 
scheme  results  in  slightly  higher  power  levels  for  the  users  in  the  network. 

As for future research, one may try to design adaptive algorithms where the parameters of the algorithm
and even the power update rates are adaptively adjusted for individual users, according to such information
as user velocities. Also, the standard integrator algorithm may not be the best power control algorithm.
Even though constraints on complexity and computational effort are always present, other simple algorithms
may still be designed that could result in better SIR tracking, better allocation of resources and finally higher
levels of capacity in highly non-uniform and non-stationary environments. Finally, analyzing the behavior of
any prediction filter, both in terms of convergence and performance, under bursty interference conditions can
constitute another interesting line of future research.

Abstract

Power control is considered as an efficient scheme to mitigate co-channel and multiple-access interference
in cellular radio systems. Various approaches have been proposed in recent years to design power control
algorithms. We focus on the feedback algorithms that are based on Signal to Interference plus Noise Ratios
(SIR-based algorithms). We review the SIR threshold approach and then discuss how power control design can
be formulated as a decentralized regulation problem. We use a robust control framework to analyze global
stability of a network on a single channel. We obtain a sufficient condition, which guarantees that the
deviations of the power levels from their optimal values remain bounded, even when the channel gains change,
as long as the network stays feasible.

1  Introduction 

Optimal  allocation  of  transmit  power  levels  in  wireless 
networks  has  attracted  a  lot  of  attention  in  recent 
years.  The  main  idea  is  to  control  the  transmit  power 
level  of  a  user  or  a  base  station  in  a  wireless  system 
in  order  to  maintain  an  acceptable  level  of  quality  of 
service,  while  eliminating  unnecessary  interference  to 
other  users  in  the  network.  Different  objectives  and 
approaches  have  been  perceived  for  power  control  and 
different  algorithms  have  been  naturally  obtained. 

The major objective in Direct Sequence Code Division Multiple Access systems is to mitigate the multiple
access interference and therefore the near-far effect, whereas in Time/Frequency Division Multiple
Access systems the objective is mostly to control the co-channel interference. Power control will also
minimize the power consumption for the users and hence prolong their battery life.

We focus on power control algorithms that are based on Signal to Interference plus Noise Ratio (SIR). Note
that, in general, higher SIR would yield better bit error performance and it is therefore common to abstract
the system architecture and consider SIR as the measure for quality of service in order to formulate the
power control objective.

One approach for SIR-based power control design is the SIR threshold approach, presented in [1], where the
objective is for the SIR of each user in the network to be above a desired threshold. It is shown how the
optimal powers could be obtained through a simple distributed algorithm. The necessary and sufficient
condition for the existence of the optimal powers is expressed as a feasibility condition. Various
generalizations of this algorithm were later discussed in the literature. A uniform framework, along with
convergence analysis (under the condition of feasibility) for many of these algorithms, was presented in [2].

In this paper, we focus on the decentralized regulator formulation for power control design. It has been
noticed that the distributed algorithm presented in [1] is simply an integrator algorithm, in a closed loop,
on the logarithmic scale. This has initiated the decentralized regulator approach for power control design,
where concepts and design methodologies from control theory have been used for the analysis of current
algorithms [3] and the design of new algorithms [4] [5]. This approach could be especially helpful if models
for fading, i.e., channel gain variations, are to be incorporated in the design. However, stability and
convergence of these algorithms cannot be verified through simple techniques such as the one presented
in [2]. Therefore, more complicated stability analysis should be performed to ensure global stability of the
network under these power control algorithms. A robust control framework was presented in [4], where a
sufficient condition for global stability was established using a linearized interference function. We use a
similar framework to obtain another sufficient condition for global stability without any interference
linearization. This condition will guarantee that, under a designed power control algorithm, the deviations
of the power levels in the network from their corresponding optimal values will always remain bounded even
when the channel gains change, as long as the variations in the channel gains do not force the network out
of its feasibility region.


The organization of the paper is as follows. In the next section, we present the system model and review
the SIR threshold approach. In Section 3, we review the decentralized regulator formulation for power
control design, and in Section 4, we obtain a sufficient condition for global stability. Concluding remarks
are provided in the final section.


2  System  Model  and  SIR  Threshold 
Approach 

Consider a cellular system where M users are sharing a single channel. This channel could be a frequency
band (FDMA), a time slot (TDMA) or even a spreading code (CDMA). Therefore, for every desired user-base
station link, there are $M - 1$ interfering links. The received SIR on the uplink channel for user i can
now be written as:

$$\Gamma_i = \frac{g_{ii} P_i}{\sum_{j \neq i} g_{ij} P_j + \eta_i} \qquad (1)$$

where $P_i$ is the transmit power for user i, $g_{ii}$ is the channel gain (or attenuation) from user i to its
desired base station (in the linear scale), $g_{ij}$ is the channel gain from user j to the desired base station
of user i and $\eta_i$ is the receiver noise intensity at the desired base station of user i. Note that even
though we choose to focus on the uplink channel, a similar model and similar results can be obtained for the
downlink channel. Define the normalized channel gain matrix Z as:

$$Z = [z_{ij}], \qquad z_{ij} = \frac{g_{ij}}{g_{ii}} \qquad (2)$$

Note that Z is a non-negative stochastic matrix and, in general, is not symmetric.


In the SIR threshold approach, the objective is for the SIR of every user i to be above a desired threshold
$\gamma_i$, that is: $\Gamma_i \geq \gamma_i$. It is easy to show that these constraints can be written in the matrix form as:

$$P \geq \Gamma (Z - I) P + U \qquad (3)$$

where $\Gamma = \mathrm{diag}(\gamma_1, \ldots, \gamma_M)$ and $U = [u_i] = [\gamma_i \eta_i / g_{ii}]$ and I is the identity matrix. The necessary
and sufficient condition for the existence of a positive power vector P, which satisfies the above constraint,
is called feasibility. In other words, a network of users is called feasible if every user can achieve its
desired SIR. The corresponding power vector is then called a feasible power vector. It is clear that
feasibility of a network depends on all channel gains and all desired SIRs. In the SIR threshold approach,
the feasibility condition is quantified and the minimum feasible power vector is obtained.

Theorem 2.1 (SIR Threshold) Assuming $U > 0$, a network of users is feasible if and only if $\rho(F) < 1$,
where:

$$F \triangleq \Gamma (Z - I), \quad \text{i.e.,} \quad f_{ii} = 0, \quad f_{ij} = \frac{\gamma_i g_{ij}}{g_{ii}} \qquad (4)$$

and $\rho$ denotes the spectral radius of a matrix. Under the feasibility condition, the optimal power vector is
obtained as:

$$P^{*} = (I - F)^{-1} U \qquad (5)$$

Matrix F is a non-negative (component-wise) irreducible matrix and the above theorem can be proved using
some results from the theory of non-negative matrices [6]. The power vector $P^{*}$ is optimal in the sense that
for any other feasible power vector P, we have $P \geq P^{*}$.

The above solution for $P^{*}$ is a centralized solution in the sense that a central processor needs to gather all
the information about all the channel gains and target SIRs, calculate the optimal power vector and send
back the corresponding power command to every user. It was shown in [1] that a simple iterative algorithm,
which could be implemented in a distributed manner, would converge to $P^{*}$. In fact, it is clear that under
the condition of feasibility, the optimal power vector $P^{*}$ is the unique fixed point of the following
iterative algorithm:

$$P(n) = F P(n-1) + U \qquad (6)$$

and component-wise, we can write:

$$P_i(n) = \frac{\gamma_i}{g_{ii}} \Big( \sum_{j \neq i} g_{ij} P_j(n-1) + \eta_i \Big) = \frac{\gamma_i}{g_{ii}} I_i(n) \qquad (7)$$

where $I_i(n)$ is the total interference plus noise power at the receiver of the assigned base station for user
i. Therefore, at the beginning of the n-th power update period, the local mean channel gain $g_{ii}$ and the
local mean total interference plus noise power $I_i(n)$ are measured at the receiver and the new power level
$P_i(n)$ is calculated and sent back to the user. Note that no extra delays are assumed for processing and
propagation. Moreover, the convergence is proved assuming that all the channel gains and the desired SIRs
stay constant for the duration of convergence of the algorithm. This may not always be a reasonable
assumption, especially if fast fading is considered while low power update rates are assumed. In the next
section, we discuss how power control can be posed as a decentralized regulator problem.

Figure 1: A local power control loop, associated with a single user
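
The feasibility test and the distributed iteration of this section can be illustrated numerically. The sketch below is a minimal pure-Python example under stated assumptions: the 3-user gain matrix, noise levels and function names are hypothetical, and the spectral radius is estimated by power iteration on F + I (for a non-negative matrix the Perron root simply shifts by one, and the shift avoids oscillation when F has a zero diagonal).

```python
def build_f_u(gains, gamma, eta):
    """F and U with f_ii = 0, f_ij = gamma_i g_ij / g_ii, u_i = gamma_i eta_i / g_ii."""
    m = len(gains)
    f = [[0.0 if i == j else gamma[i] * gains[i][j] / gains[i][i]
          for j in range(m)] for i in range(m)]
    u = [gamma[i] * eta[i] / gains[i][i] for i in range(m)]
    return f, u


def spectral_radius(f, iters=500):
    """Estimate rho(F) for a non-negative irreducible F via power iteration on F + I."""
    m = len(f)
    x = [1.0] * m
    lam = 1.0
    for _ in range(iters):
        y = [x[i] + sum(f[i][j] * x[j] for j in range(m)) for i in range(m)]
        lam = max(y)
        x = [v / lam for v in y]
    return lam - 1.0  # undo the shift by the identity


def iterate_powers(f, u, steps=200):
    """Distributed iteration P(n) = F P(n-1) + U, started from zero power."""
    m = len(u)
    p = [0.0] * m
    for _ in range(steps):
        p = [sum(f[i][j] * p[j] for j in range(m)) + u[i] for i in range(m)]
    return p


def sir(gains, p, eta):
    """Received SIR of every user in linear scale."""
    m = len(p)
    return [gains[i][i] * p[i] /
            (sum(gains[i][j] * p[j] for j in range(m) if j != i) + eta[i])
            for i in range(m)]


# Hypothetical 3-user network; all numbers are illustrative only.
GAINS = [[1.0, 0.02, 0.01],
         [0.015, 1.0, 0.02],
         [0.01, 0.025, 1.0]]
GAMMA = [10 ** (12 / 10)] * 3  # 12 dB target SIR in linear scale
ETA = [1e-3] * 3

F, U = build_f_u(GAINS, GAMMA, ETA)
RHO = spectral_radius(F)          # feasibility requires RHO < 1
P_STAR = iterate_powers(F, U)     # approximates the fixed point P*
SIRS = sir(GAINS, P_STAR, ETA)    # each entry should match its target
```

In this example the estimated spectral radius is below one, so the network is feasible, and after the iteration every user's SIR sits at its target, as the fixed-point property predicts.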

3  Decentralized  Regulator  Approach 

Using a bar on the variables to indicate the values in dB, we can write the distributed algorithm in (7) in
logarithmic scale as:

$$\bar{p}_i(n) = \bar{p}_i(n-1) + (\bar{\gamma}_i - \bar{r}_i(n)) = \bar{p}_i(n-1) + \bar{e}_i(n) \qquad (8)$$

where $\bar{p}_i(n)$ is the power level in dBm for user i for the duration of the n-th power update period and
$\bar{r}_i(n)$ is the SIR in dB for the same user at the beginning of the n-th power update period:

$$\bar{r}_i(n) = \bar{p}_i(n-1) + \bar{g}_{ii}(n) - \bar{I}_i(n) \qquad (9)$$

Moreover, $\bar{I}_i(n)$ is the local mean interference plus noise power in dBm available at the beginning of the
n-th power update period:

$$\bar{I}_i(n) = 10 \log_{10} \Big( \sum_{j \neq i} g_{ij}(n) P_j(n-1) + \eta_i \Big) \qquad (10)$$

It is now easy to see that this algorithm is, in fact, a simple unity gain integrator algorithm in a closed
local loop, as shown in Figure 1. The controller transfer function in this case is:

$$K_i(q^{-1}) = \frac{1}{1 - q^{-1}} \qquad (11)$$

where q is the shift operator. Therefore, the network can be seen as a set of interconnected local loops,
each of which is associated with a single user. It should be realized that the coupling among the local
loops is through the interference function (10), which, in general, is nonlinear. The decentralized
regulator formulation of the power control problem can now be presented as: Design a set of local controllers
$K_i(q^{-1})$ such that the SIR for every user i, $\Gamma_i$, tracks a desired threshold $\gamma_i$ with a certain performance
while the global network remains stable.

This approach has already initiated research on using control theory concepts for power control design
[3]-[5]. Note that the local loops in Figure 1 are quite general and can be modified to accommodate different
modeling assumptions. For example, extra delay blocks may be inserted in the feedback path to model
processing and propagation delays. In fact, a one-step delay is typically assumed when high power update
rates are considered [7]. It should also be mentioned that we have implicitly assumed a linear time-invariant
controller by writing $K_i(q^{-1})$. However, in general, the controller itself can be a nonlinear block, as
is the case for Fixed-Step power control algorithms.

4  Global Stability

Unfortunately, stability and convergence of the power control algorithms, designed as decentralized
regulation algorithms, cannot be verified through simple techniques such as the one presented in [2]. A robust
control framework was proposed in [4] to obtain a sufficient condition for global stability using a linearized
interference function. We will use a similar approach, but with a different notion of stability, and we obtain
a more general sufficient condition for global stability without any interference linearization.


We consider a system to be stable if bounded inputs generate bounded outputs. In robust control
terminology [8] [9], we use the $\ell_\infty$ norm to quantify the size of the signals in the system and
$\ell_\infty$-induced norms to quantify the amplification of the signals, i.e., the size of operators or transfer
functions. We will obtain a sufficient global stability condition using a fundamental stability result
called the Small Gain Theorem.

Theorem 4.1 (Small Gain Theorem) Consider the feedback loop in Figure 2. Let $G_1 : \ell_\infty^{p} \to \ell_\infty^{m}$ and
$G_2 : \ell_\infty^{m} \to \ell_\infty^{p}$ be two stable operators and assume that the closed loop system is well-posed (i.e., for
any $u_1, u_2 \in \ell_\infty$, there exists a unique solution for $y_1, y_2 \in \ell_\infty$). Then the closed-loop system is stable if
$\|G_1\|_{\ell_\infty\text{-induced}} \, \|G_2\|_{\ell_\infty\text{-induced}} < 1$.

Figure 2: Closed-Loop Stability

Figure 3: The Power-Controlled Global Network (on a single channel)

Figure 4: The Power-Controlled Global Network in a Variational Form

Note that the above theorem only states a sufficient condition, which may be conservative in some cases.

As we mentioned, a network of users can be seen as a nonlinearly coupled set of local loops. In fact,
the global network can be depicted as in Figure 3, where $G(q^{-1}) = \mathrm{diag}(G_1(q^{-1}), \ldots, G_M(q^{-1}))$ is a
block diagonal closed-loop transfer function matrix from interference $\bar{I}(n)$ to power $\bar{P}(n-1)$ and $\bar{I}(\cdot)$ is a
nonlinear operator, which produces interference plus noise power in dBm from the power levels. Note that
$G_i(q^{-1})$ is also equal to the closed-loop transfer function from $\bar{\gamma}_i$ to $\bar{r}_i$. We have:

$$\bar{P}(n-1) = G(q^{-1}) \big( \bar{I}(n) - \bar{g}(n) + \bar{\gamma}(n) \big) \qquad (12)$$

where:

$$\bar{g} = [\bar{g}_{11} \cdots \bar{g}_{MM}]^{T} \qquad (13)$$

$$\bar{\gamma} = [\bar{\gamma}_1 \cdots \bar{\gamma}_M]^{T} \qquad (14)$$

$$\bar{I}(n) = [\bar{I}_1(P(n-1)) \cdots \bar{I}_M(P(n-1))]^{T} \qquad (15)$$

Now let us assume that the network always stays feasible. Note that we are not assuming the channel gains to
be constant; we only assume that the time variations of the channel gains do not push the network out
of its feasibility region. Therefore, at any instant of time, there exists an instantaneous bounded optimal
power vector $\bar{P}^{*}$, which is related to the corresponding optimal interference as:

$$\bar{P}^{*}(n-1) = \bar{I}^{*}(n) - \bar{g}(n) + \bar{\gamma}(n) \qquad (16)$$

Since we are not considering user arrivals or departures, $\bar{P}^{*}$ will be constant as long as the desired SIR
thresholds and the channel gains remain constant. We now consider the deviations of the power and
interference levels in the network, at every instant of time, relative to their optimal values, that is:

$$\Delta\bar{P} = \bar{P} - \bar{P}^{*} \qquad (17)$$

$$\Delta\bar{I} = \bar{I} - \bar{I}^{*} \qquad (18)$$

Using (12) and (16), we can now write:

$$\bar{P}(n-1) - G(q^{-1}) \bar{P}^{*}(n-1) = G(q^{-1}) \Delta\bar{I}(n) \qquad (19)$$

Hence:

$$\begin{aligned}
\Delta\bar{P}(n-1) &= \bar{P}(n-1) - \bar{P}^{*}(n-1) \\
&= \bar{P}(n-1) - G(q^{-1}) \bar{P}^{*}(n-1) + \big( G(q^{-1}) - I_d \big) \bar{P}^{*}(n-1) \\
&= G(q^{-1}) \Delta\bar{I}(n) + \big( G(q^{-1}) - I_d \big) \bar{P}^{*}(n-1)
\end{aligned} \qquad (20)$$

where $I_d$ is the identity matrix. The network can then be shown as in Figure 4, where $\Delta_{PI}$ is the
nonlinear operator transforming $\Delta\bar{P}$ to $\Delta\bar{I}$. We can show that $\Delta_{PI}$ is a contractive operator in the sense
that $\|\Delta_{PI}\|_{\ell_\infty\text{-induced}} < 1$. To do so, we use the Mean Value Theorem [10].

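Lemma 4.2

$$\|\Delta_{PI}\|_{\ell_\infty\text{-induced}} < 1 \qquad (21)$$

Proof: Using (10), it is straightforward to show:

$$\frac{\partial \bar{I}_i}{\partial \bar{p}_j} = \begin{cases} 0, & i = j \\[4pt] \dfrac{g_{ij} P_j}{\sum_{l \neq i} g_{il} P_l + \eta_i}, & i \neq j \end{cases} \qquad (22)$$

Remember again that the variables without bar indicate values in linear scale. From the Mean Value
Theorem, we know that for every i and for every optimal power vector $P^{*}$, there exists a power vector
$\tilde{P}$ lying on the line segment between $P$ and $P^{*}$ such that:

$$\Delta\bar{I}_i = \left. \frac{\partial \bar{I}_i}{\partial \bar{P}} \right|_{P = \tilde{P}} \Delta\bar{P} \qquad (23)$$

Now using (22) and (23) and assuming $\|\Delta\bar{P}\|_\infty \leq 1$, which then implies $|\Delta\bar{p}_i(k)| \leq 1$ for all $i = 1, \ldots, M$
and $k = 0, 1, \ldots$, we can write:

$$\begin{aligned}
|\Delta\bar{I}_i(k)| &= \left| \sum_{j \neq i} \frac{g_{ij}(k)\, \tilde{P}_j(k-1)}{\sum_{l \neq i} g_{il}(k)\, \tilde{P}_l(k-1) + \eta_i} \, \Delta\bar{p}_j(k-1) \right| & (24) \\
&\leq \sum_{j \neq i} \frac{g_{ij}(k)\, \tilde{P}_j(k-1)}{\sum_{l \neq i} g_{il}(k)\, \tilde{P}_l(k-1) + \eta_i} \, |\Delta\bar{p}_j(k-1)| & (25) \\
&\leq \frac{\sum_{j \neq i} g_{ij}(k)\, \tilde{P}_j(k-1)}{\sum_{l \neq i} g_{il}(k)\, \tilde{P}_l(k-1) + \eta_i} < 1 & (26)
\end{aligned}$$

Therefore: $\|\Delta\bar{I}\|_\infty = \sup_k \max_i |\Delta\bar{I}_i(k)| < 1$, and
hence: $\|\Delta_{PI}\|_{\ell_\infty\text{-induced}} = \sup_{\|\Delta\bar{P}\|_\infty \leq 1} \|\Delta\bar{I}\|_\infty < 1$.

Note that $\|\Delta_{PI}\|_{\ell_\infty\text{-induced}} = 1$ if no receiver noise is considered for any of the receivers.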
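
The key step of the proof is that, for each user, the sensitivities of the dB interference with respect to the dB powers of the other users sum to less than one whenever receiver noise is present. The following sketch checks this on a hypothetical example; the function names, the gain matrix and the power levels are illustrative assumptions only.

```python
import math


def interference_dbm(gains, p_dbm, eta, i):
    """Interference plus noise at the base station of user i, in dB units."""
    total = sum(gains[i][j] * 10 ** (p_dbm[j] / 10)
                for j in range(len(p_dbm)) if j != i) + eta[i]
    return 10 * math.log10(total)


def jacobian_row_sum(gains, p_dbm, eta, i):
    """Sum over j != i of d(I_i in dB) / d(p_j in dB)."""
    interf = sum(gains[i][j] * 10 ** (p_dbm[j] / 10)
                 for j in range(len(p_dbm)) if j != i)
    return interf / (interf + eta[i])


# Hypothetical example values (illustrative only).
GAINS = [[1.0, 0.02, 0.01],
         [0.015, 1.0, 0.02],
         [0.01, 0.025, 1.0]]
P_DBM = [0.0, 3.0, -2.0]
ETA = [1e-3] * 3
NO_NOISE = [0.0] * 3

ROW_SUMS = [jacobian_row_sum(GAINS, P_DBM, ETA, i) for i in range(3)]

# Raising every power by 1 dB raises the interference by less than 1 dB.
RAISED = [p + 1.0 for p in P_DBM]
SHIFTS = [interference_dbm(GAINS, RAISED, ETA, i) -
          interference_dbm(GAINS, P_DBM, ETA, i) for i in range(3)]
```

With noise, every entry of `ROW_SUMS` and `SHIFTS` is strictly between 0 and 1; repeating the computation with `NO_NOISE` gives row sums equal to one, which matches the remark that the induced norm equals one in the noise-free case.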


It is clear that stability of every local loop is a necessary (but not sufficient) condition for global
stability. We are now ready to state a sufficient condition for global stability of the network.

Theorem 4.3 (Global Stability) Consider the network in Figure 4. Assume that the network is always
feasible, i.e., there always exists a bounded power vector $\bar{P}^{*}$ satisfying (16). Then the network is
globally stable if for every user i:

$$\|G_i(q^{-1})\|_{\ell_\infty\text{-induced}} \leq 1 \qquad (27)$$

Proof: Since $G_i(q^{-1})$ always incorporates a delay, it is easy to see that the operator $\Delta_{PI} G$ is always
strictly causal and hence the closed loop system in Figure 4 is always well-posed. Moreover, the feasibility
assumption guarantees the existence of a bounded $\bar{P}^{*}$. Therefore, if $\|G_i(q^{-1})\|_{\ell_\infty\text{-induced}} \leq 1$ for every
user i, we will have $\|G(q^{-1})\|_{\ell_\infty\text{-induced}} \leq 1$ and using Lemma 4.2, the global stability of the network
will be established simply by invoking the Small Gain Theorem.

The above theorem states that if the feasibility condition is not violated and if (27) is satisfied, then the
deviations of the power levels of all the users in the network from their corresponding optimal values will
always remain bounded. Even though the condition (27) is only sufficient and might be conservative in
some cases, it can still help us design new stable algorithms and analyze the stability of current algorithms
under channel gain variations. We will show this by an example.

But first, we want to compare our result with the one presented in [4]. It was shown in [4] that if the channel
gains stay constant and if the network is feasible (i.e., a constant optimal power vector exists) and if the
interference function is linearized around this optimal power vector, then a sufficient condition for global
stability of the linearized network (in the $\ell_2$-induced norm sense) is:

$$\|G_i(q^{-1})\|_{\ell_2\text{-induced}} = \sup_{\omega} |G_i(e^{j\omega})| \leq 1 \qquad (28)$$

This means that if the power vector of the network deviates slightly from the optimal power vector,
and as long as all the channel gains stay constant, the power levels will asymptotically move back to their
optimal values. In contrast, in deriving the sufficient condition (27), no constant channel gain assumption
was made and no linearization was involved. However, stability in the $\ell_\infty$-induced norm sense does not imply
asymptotic convergence of small power level deviations to zero. Instead, it implies that the deviations
always remain bounded even if the optimal power vector changes due to the variations in the channel gains.
Also, (27) is sometimes more conservative, since we always have:

$$\|G_i(q^{-1})\|_{\ell_2\text{-induced}} \leq \|G_i(q^{-1})\|_{\ell_\infty\text{-induced}} \qquad (29)$$

Example: Consider the integral algorithm in (8) with gain $\beta$, i.e.: $\bar{p}_i(k) = \bar{p}_i(k-1) + \beta(\bar{\gamma}_i - \bar{r}_i(k))$, or
in linear scale:

$$P_i(k) = P_i(k-1) \left( \frac{\gamma_i}{\Gamma_i(k)} \right)^{\beta} \qquad (30)$$

We have:

$$G_i(q^{-1}) = \frac{q^{-1} K_i(q^{-1})}{1 + q^{-1} K_i(q^{-1})} = \frac{\beta q^{-1}}{1 - (1 - \beta) q^{-1}} \qquad (31)$$

We should first note that for the local loops to be stable we need to have $\beta \in (0, 2)$. It is now easy to
show that for $0 < \beta \leq 1$, we have:

$$\|G_i(q^{-1})\|_{\ell_\infty\text{-induced}} = \|G_i(q^{-1})\|_{\ell_2\text{-induced}} = 1 \qquad (32)$$

and when $\beta$ becomes larger than one, both induced norms start increasing. This proves that not only do
the power levels, obtained from the distributed iterative algorithm in [1] (where $\beta = 1$ is assumed),
converge to their optimal levels if the channel gains stay constant, but also, under the channel gain
variations, the deviations of the power levels from their optimal values always remain bounded.

Figure 5: $\ell_\infty$-induced and $\ell_2$-induced norms for $G_i$ in the one step delayed case


It  is  instructive  to  also  consider  the  case  where  an 
additional  delay  is  assumed  for  processing  and  propa¬ 
gation,  i.e.,  one  step  delay  is  inserted  in  the  feedback 
path  in  Figure  1.  In  this  case: 


$$G_i(q^{-1}) = \frac{q^{-2}K_i(q^{-1})}{1 + q^{-2}K_i(q^{-1})} = \frac{\beta q^{-2}}{1 - q^{-1} + \beta q^{-2}} \tag{33}$$


First note that $\beta = 1$ will result in closed-loop poles on the unit circle and therefore instability of the local loops. The $\ell_\infty$-induced and $\ell_2$-induced norms of $G_i$ are shown in Figure 5. It can be seen that in order to guarantee the bounded deviations of the power levels in the network (i.e., the global stability in the $\ell_\infty$ sense), we need to approximately have $\beta < 0.27$. Moreover, to ensure the global stability of the linearized system in the $\ell_2$ sense, we need to have $\beta < 0.33$. These bounds on the gain are rather small and could therefore result in slow responses to the changes in SIR thresholds or the channel gains. However, remember that the sufficient conditions for global stability have been obtained under worst case scenarios and therefore might yield conservative requirements in some cases.
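The dependence of the two induced norms on the gain can be explored numerically. The sketch below assumes the closed-loop map has the form $G_i(q^{-1}) = \beta q^{-d}/(1 - q^{-1} + \beta q^{-d})$, with $d = 1$ for the basic integrator and $d = 2$ for the extra-delay case in (33). The function names, grid size and truncation length are illustrative choices, not part of the original analysis. The $\ell_\infty$-induced norm is estimated as the $\ell_1$ norm of the truncated impulse response, and the $\ell_2$-induced norm as the peak magnitude of the frequency response on a grid.

```python
import numpy as np

def coefficients(beta, delay):
    """Numerator b and denominator a of G in powers of q^{-1} (assumed form)."""
    a = np.zeros(delay + 1)
    b = np.zeros(delay + 1)
    a[0] = 1.0
    a[1] -= 1.0        # integrator contributes -q^{-1}
    a[delay] += beta   # loop gain delayed by `delay` steps
    b[delay] = beta
    return b, a

def linf_induced_norm(beta, delay, n_terms=5000):
    """l1 norm of the truncated impulse response of G."""
    b, a = coefficients(beta, delay)
    h = np.zeros(n_terms)
    for n in range(n_terms):
        acc = b[n] if n < len(b) else 0.0
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * h[n - k]
        h[n] = acc
    return np.abs(h).sum()

def l2_induced_norm(beta, delay, n_grid=20001):
    """Peak of |G(e^{jw})| over a grid on [0, pi]."""
    w = np.linspace(0.0, np.pi, n_grid)
    zinv = np.exp(-1j * w)
    b, a = coefficients(beta, delay)
    num = sum(bk * zinv**k for k, bk in enumerate(b))
    den = sum(ak * zinv**k for k, ak in enumerate(a))
    return np.abs(num / den).max()

if __name__ == "__main__":
    for beta in (0.2, 0.3, 0.5):
        print(beta, linf_induced_norm(beta, 2), l2_induced_norm(beta, 2))
```

Under this assumed form, both norms sit at the DC gain of 1 for small gains and rise above 1 once the gain is large enough, which is the qualitative behaviour the text reads off Figure 5.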


5  Conclusion 

We reviewed the SIR threshold approach for power control design in cellular wireless systems. Then we discussed the decentralized regulator formulation for the power control problem. Using a robust control framework, we obtained a sufficient condition which guarantees that the deviations of the power levels from their corresponding optimal values always remain bounded. We then showed that if no extra delay is considered for processing and propagation, the widely proposed integrator algorithm does indeed yield a globally stable network as long as the variations of the channel gains do not force the network out of its feasibility region. As future work, we shall try to quantify bounds on the power level deviations.
Abstract

An LQ strong stabilization problem is proposed. To determine when a controller with periodic gains is locally superior to a linear time invariant compensator for this problem, a Π test is presented. For systems with strictly proper transfer functions, it is proven that the frequency range where stable periodic controllers based on weak variations about the LTI case can give better performance than stable LTI compensators is finite. In the development, a means to evaluate the second partials of functions with respect to matrix-valued parameters is introduced. For those systems where periodic control is warranted, techniques for designing optimal periodic strongly stabilizing controllers are presented. Two examples detailing the application of the Π test are provided, as well as an optimal periodic controller design example.

©  2003  Published  by  Elsevier  Science  Ltd. 

Keywords: Optimal control; Chattering; Stability properties; LQG control; Periodic


1.  Introduction 

Often it is desired that an output feedback controller not only stabilize a plant, but be stable itself. The process of designing such a controller is referred to as the strong stabilization problem. It has recently been shown that all linear time-invariant (LTI) systems that are both detectable and controllable can be strongly stabilized by periodic controllers (Savkin & Petersen, 1998). The proposed controller design consists of a full state controller that during a period of length $T$ operates without any measurements upon a propagated state estimate. At the end of the period, this state estimate is updated by a Luenberger estimator.

This method has some drawbacks, however. The period must be longer than a minimum length $T_0$ to ensure strong stabilization, and the gain of the controller between the periodic updates affects the size of $T_0$. Because a large period implies poor performance in the presence of disturbances, $T_0$ must be kept reasonably small, but reducing $T_0$ requires high controller gains. Also, it is worrisome from a robustness standpoint that the controller runs open loop over each period. We will demonstrate that the disturbance rejection capability of a stable continuous feedback controller is considerably better.

* A portion of this paper was presented in August 2001 at the IFAC Workshop on Periodic Control Systems in Cernobbio-Como, Italy. This
(J.D. Wolfe), speyer@seas.ucla.edu (J.L. Speyer).

The primary contribution of this paper is a cost function formulation that induces strong stability. Because this cost is non-convex, it provides an opportunity for periodic strongly stabilizing controllers to produce a lower cost than strongly stabilizing LTI controllers. Before designing a strongly stabilizing controller, however, it is wise to investigate the following related question: If we restrict ourselves to considering only observer-structure controllers, and require the controller to be stable, when can a control with periodic gains potentially reduce the cost function compared to one with fixed gains? To answer this question, we construct a Π test (Bittanti, Fronza, & Guardabassi, 1973; Bernstein & Gilbert, 1980) that indicates when small periodic variations from the best time invariant controller may improve the cost function. Of interest in its own right is the procedure we develop for converting problems where the optimization parameters are gain matrices into a form amenable to application of the Π test. Since a considerable number of fixed structure problems (including the static output feedback problem and several decentralized control problems) involve optimizing over gain matrices, the method derived here appears to have many extensions.

0005-1098/03/$ - see front matter © 2003 Published by Elsevier Science Ltd.
doi:10.1016/S0005-1098(03)00178-X

J.D. Wolfe, J.L. Speyer / Automatica

We then develop techniques for designing periodic controllers that minimize our cost function and thus satisfy strong stability. It should be emphasized that a periodic stable controller may be determined even when a static stable controller does not exist. Furthermore, although the Π test will indicate when an LTI stable compensator is not a local minimum (and that a stable periodic design can outperform it), failure of the Π test does not imply that the stable LTI design is globally optimal, so construction of a stable periodic controller may still be worthwhile.

This paper is organized as follows: Section 2 formulates a new cost function that penalizes unstable controllers. In Section 3, we state some results on the second derivatives of traces of matrix functions that are interesting in themselves for numerical optimization of fixed structure controllers. The Π test for strongly stabilizing controllers is presented in Section 5, and it is shown that when a plant transfer function is strictly proper, gains based on weak variations from the static gains can reduce the cost function below the cost with an LTI controller only over a finite frequency range. Section 6 describes the process of optimal periodic controller design. The Π test is applied to example systems and an optimal periodic controller is demonstrated in Section 7. Section 8 concludes the paper.

2. The optimal control problem with a strong stabilization constraint


Consider the Gauss-Markov time-invariant system described by

$$dx = (Ax + Bu)\,dt + [\Gamma_1 \;\; 0]\,dw, \tag{1}$$

$$dy = (Cx + Du)\,dt + [0 \;\; \Gamma_2]\,dw, \tag{2}$$

where $x \in \mathbb{R}^n$, $y \in \mathbb{R}^p$, $u \in \mathbb{R}^m$ and $w \in \mathbb{R}^q$ is a Brownian motion process whose independent increment processes $dw$ have the statistics

$$E[dw\,dw^T] = I\,dt, \qquad E[dw] = 0, \tag{3}$$

where $E[\cdot]$ indicates the expectation operator and $I$ indicates the identity matrix. Without loss of generality, $\Gamma_2$ is assumed to have full row rank. Our cost function is the expectation of the quadratic cost function suggested in Bittanti et al. (1973):

$$J[u,\tau] = \lim_{k\to\infty} \frac{1}{k\tau} E\left\{\int_0^{k\tau} (x^TQx + u^TRu)\,dt\right\}, \tag{4}$$

where $\tau$ is the period of a cycle, $k$ is the number of cycles, $Q$ is a symmetric nonnegative definite matrix and $R$ is a symmetric positive definite matrix. The answer to this optimization problem is the well-known linear quadratic Gaussian controller

$$d\hat{x} = A\hat{x}\,dt + Bu\,dt + L(dy - C\hat{x}\,dt - Du\,dt), \tag{5}$$

$$u = -K\hat{x}, \tag{6}$$

where $K$ is the linear-quadratic regulator gain and $L$ is the gain of the Kalman-Bucy filter.

Observe that if we define $e = x - \hat{x}$, the closed loop dynamics and cost can be rewritten as

$$d\begin{bmatrix} x \\ e \end{bmatrix} = \begin{bmatrix} A - BK & BK \\ 0 & A - LC \end{bmatrix}\begin{bmatrix} x \\ e \end{bmatrix} dt + \begin{bmatrix} \Gamma_1 & 0 \\ \Gamma_1 & -L\Gamma_2 \end{bmatrix} dw, \tag{7}$$

$$z_{LQG} = \begin{bmatrix} Q^{1/2} & 0 \\ -R^{1/2}K & R^{1/2}K \end{bmatrix}\begin{bmatrix} x \\ e \end{bmatrix}, \tag{8}$$

$$J[K,L,\tau] = \lim_{k\to\infty} \frac{1}{k\tau} E\left\{\int_0^{k\tau} (z_{LQG}^T z_{LQG})\,dt\right\}. \tag{9}$$

Note that the dynamics of the controller are described by

$$A_c \triangleq A - BK - LC + LDK, \tag{10}$$

and that this matrix need not be Hurwitz.

Suppose we were to add a cost term that would penalize an unstable controller. If we constrained the controller to have the same observer structure as before, the dynamics and cost would look like

$$dx_{cl} = A_{cl}x_{cl}\,dt + B_{cl}\,dw, \tag{11}$$

$$z = C_{cl}x_{cl}, \tag{12}$$

$$J[K,L,\tau] = \lim_{k\to\infty} \frac{1}{k\tau} E\left\{\int_0^{k\tau} (z^Tz)\,dt\right\}, \tag{13}$$

with

$$A_{cl} = \begin{bmatrix} A - BK & BK & 0 \\ 0 & A - LC & 0 \\ 0 & 0 & A_c \end{bmatrix}, \quad B_{cl} = \begin{bmatrix} \Gamma_1 & 0 & 0 \\ \Gamma_1 & -L\Gamma_2 & 0 \\ 0 & 0 & \Gamma_f \end{bmatrix}, \quad C_{cl} = \begin{bmatrix} Q^{1/2} & 0 & 0 \\ -R^{1/2}K & R^{1/2}K & 0 \\ 0 & 0 & Q_f^{1/2} \end{bmatrix}, \tag{14}$$
where the new state $x_f$, with the dynamics of the controller, is forced by noise and included in the cost function via the weight $Q_f$. To ensure that all controller states are penalized in the cost, it is required that $Q_f$ be positive definite and that $\Gamma_f$ have full row rank.

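The cost expression above can be written in terms of the covariance of $x_{cl}$, $P = E[x_{cl}x_{cl}^T]$:

$$J = \lim_{k\to\infty} \mathrm{tr}\,\frac{1}{k\tau}\int_0^{k\tau} \{P(t)\,C_{cl}(t)^TC_{cl}(t)\}\,dt, \tag{15}$$

subject to

$$\dot{P} = A_{cl}(t)P + PA_{cl}(t)^T + B_{cl}(t)B_{cl}(t)^T, \tag{16}$$

$$P \ge 0, \tag{17}$$

where tr denotes the trace operation. Let us partition $P$ into $n \times n$ pieces as follows:

$$P = \begin{bmatrix} P_1 & P_{12} & P_{13} \\ P_{12}^T & P_2 & P_{23} \\ P_{13}^T & P_{23}^T & P_3 \end{bmatrix}. \tag{18}$$

An equivalent expression of the cost is then

$$J = \lim_{k\to\infty} \mathrm{tr}\,\frac{1}{k\tau}\int_0^{k\tau} \{P_1Q + (P_1 - P_{12} - P_{12}^T + P_2)K^TRK + P_3Q_f\}\,dt. \tag{19}$$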

We can write a Hamiltonian for this optimization problem in the usual way

$$\mathcal{H} = \mathrm{tr}\{P_1Q + (P_1 - P_{12} - P_{12}^T + P_2)K^TRK + P_3Q_f + \Lambda(A_{cl}P + PA_{cl}^T + B_{cl}B_{cl}^T)\}, \tag{20}$$

where $\mathcal{H}$ is almost identical to the Hamiltonian used in Denham and Speyer (1964) and is similar to the one (for a case with no process noise) used in Athans (1968). Following the standard derivation, the necessary conditions for minimizing $J$ are:

Theorem 1 (Pontryagin's necessary conditions). The following are necessary for minimizing $J$:

(1) $\mathcal{H}$ is minimized with respect to $K$ and $L$,

(2) $\mathcal{H}_P = -d\Lambda/dt$, $\Lambda(k\tau) = 0$, for $k = 0, 1, 2, \ldots$,

(3) $\mathcal{H}_\Lambda = dP/dt$.

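If $\mathcal{H}$ has a minimum and is continuously differentiable in $K$ and $L$, a necessary condition for minimizing $\mathcal{H}$ is $\mathcal{H}_K = \mathcal{H}_L = 0$. If we partition $\Lambda$ in the same manner as $P$ was partitioned in (18) and assume that there is a steady-state stationary solution to the optimization problem, then the following equations are satisfied at the stationary point:

$$\begin{aligned} \mathcal{H}_K = 0 = {} & 2RK(P_1 - P_{12} - P_{12}^T + P_2) + 2D^TL^T(\Lambda_{13}^TP_{13} + \Lambda_{23}^TP_{23} + \Lambda_3P_3) \\ & + 2B^T(-\Lambda_1P_1 + \Lambda_1P_{12} - \Lambda_{12}P_{12}^T + \Lambda_{12}P_2 - \Lambda_{13}^TP_{13} - \Lambda_{13}P_{13}^T \\ & \qquad + \Lambda_{13}P_{23}^T - \Lambda_{23}^TP_{23} - \Lambda_3P_3), \end{aligned} \tag{21}$$

$$\begin{aligned} \mathcal{H}_L = 0 = {} & 2\Lambda_2L\Gamma_2\Gamma_2^T + 2(\Lambda_{13}^TP_{13} + \Lambda_{23}^TP_{23} + \Lambda_3P_3)K^TD^T \\ & - 2(\Lambda_{12}^TP_{12} + \Lambda_{13}^TP_{13} + \Lambda_2P_2 + \Lambda_{23}P_{23}^T + \Lambda_{23}^TP_{23} + \Lambda_3P_3)C^T, \end{aligned} \tag{22}$$

$$-d\Lambda/dt = 0 = A_{cl}P + PA_{cl}^T + B_{cl}B_{cl}^T, \tag{23}$$

$$dP/dt = 0 = A_{cl}^T\Lambda + \Lambda A_{cl} + C_{cl}^TC_{cl}. \tag{24}$$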

2.1. When is there a steady-state stationary solution to the optimization problem

The conditions for determining when an LTI system may be stabilized by a stable controller were found in Youla, Bongiorno, and Lu (1974). The following extension of these conditions to the MIMO case can be found in Vidyasagar (1985):

Theorem 2 (Parity interlacing property). Let $\mathbb{C}_{+e}$ denote the extended right half of the complex plane ($\{s \in \mathbb{C}: \mathrm{Re}(s) \ge 0\}$, together with positive infinity). A plant $P$ is strongly stabilizable if and only if the number of poles of $P$ (counted according to their McMillan degrees) between any pair of real $\mathbb{C}_{+e}$-blocking zeros of $P$ is even.
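For a scalar plant the parity interlacing test reduces to counting real poles between consecutive real zeros on the extended nonnegative real axis. The following sketch covers that SISO special case only; the function name and its arguments are hypothetical, the zero at infinity is included when the plant is strictly proper, and poles and zeros are listed with multiplicity. It is not the MIMO test of the theorem, which requires blocking zeros and McMillan degrees.

```python
import math

def satisfies_pip(real_zeros, real_poles, strictly_proper):
    """SISO parity interlacing check on the extended nonnegative real axis.

    real_zeros / real_poles: real zeros and poles, repeated per multiplicity.
    strictly_proper: if True, the plant also has a zero at +infinity.
    """
    zeros = sorted(z for z in real_zeros if z >= 0.0)
    if strictly_proper:
        zeros.append(math.inf)
    # Checking consecutive pairs is enough: if every consecutive pair
    # encloses an even number of poles, so does every other pair.
    for lo, hi in zip(zeros, zeros[1:]):
        enclosed = sum(1 for p in real_poles if lo < p < hi)
        if enclosed % 2 == 1:
            return False
    return True

# (s + 0.5)/(s - 1): no zeros in the extended right half plane
print(satisfies_pip([-0.5], [1.0], strictly_proper=False))
# (s - 1)/(s (s - 2)): one pole (s = 2) lies between the zeros at 1 and infinity
print(satisfies_pip([1.0], [0.0, 2.0], strictly_proper=True))
```

The first plant has no pair of zeros to test, so the property holds trivially; the second is a standard example of a plant that fails it.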

Note that the stable compensator that stabilizes the system in the above theorem is a proper matrix transfer function of arbitrary order, i.e., a strictly proper stabilizing compensator with the same order as the plant may not exist. However, there are constructive sufficient conditions for stable, strictly proper full-order compensation (Wang & Bernstein, 1994). If such a compensator can be found, it can be used as a starting point for an iterative scheme to find a stationary point of our optimization problem (Geromel & Bernussou, 1979; Toivonen & Makila, 1985, 1987).

3.  Some  results  on  second-order  derivatives  of  traces 

The  sequel  will  require  some  results  on  second-order 
derivatives  of  traces  of  matrix  functions.  The  proofs  are 
substantially  the  same  for  each  case,  so  the  proof  for  one 
representative  case  has  been  included  in  Appendix  A. 
Each of the other assertions can be proven using similar arguments.

Proposition 3. Let $X, Y, A, B$ be complex matrices of appropriate dimension. Denote the $(i,j)$th component of a matrix by $(\,\cdot\,)_{ij}$. Then


tr {XAY^BYC)  =  cu[B^YA\j  +  [BYC^ap. 

^ 2 

'dxkidx-  =  bkiajt  +  bjfrdij. 

tr(XrAYrBYC)  =  au[BYC]kj  +  [ B^YA^Cj. 


4.  Products  that  convert  linear  matrix  equations  into 
linear  vector  equations 

4.1. Review of the Kronecker product

The  following  well-known  results  can  be  found  in  elemen¬ 
tary  linear  algebra  texts  (e.g.  Lancaster,  1969,  Chapter  8): 

Definition 4 (Kronecker operator). Let $\mathbb{F}$ denote a field. If $A \in \mathbb{F}^{m \times n}$ and $B \in \mathbb{F}^{o \times p}$, then the Kronecker operation on $A$ and $B$, written $A \otimes B$, is an $mo \times np$ matrix whose elements are defined by the relation $[A \otimes B]_{kl} = a_{ri}b_{sj}$, where $k = (r-1)o + s$, $l = (i-1)p + j$.

Proposition 5 (Kronecker product). If $A \in \mathbb{F}^{m \times n}$ and $B \in \mathbb{F}^{o \times p}$, then the Kronecker operation $A \otimes B$ is a well-defined product.

Proposition 6 (Kronecker product and linear matrix equations). Consider the following matrix linear equation for the unknown matrix $X \in \mathbb{F}^{n \times n}$: $AXB = C$, where $A, B, C \in \mathbb{F}^{n \times n}$. We can consider this equation as an abbreviation for $n^2$ scalar equations for the $n^2$ elements of $X$. Let us define the "vectorized" versions of $X$ and $C$ in $\mathbb{F}^{n^2}$ by

$$x = [X_{1*} \;\; X_{2*} \;\; \cdots \;\; X_{n*}]^T, \qquad c = [C_{1*} \;\; C_{2*} \;\; \cdots \;\; C_{n*}]^T,$$

where $X_{i*}$, $C_{j*}$ denote the $i$th row of $X$ and the $j$th row of $C$, respectively. Then the equation $AXB = C$ is equivalent to $Gx = c$ for some $G \in \mathbb{F}^{n^2 \times n^2}$. One can easily verify that $G = A \otimes B^T$.
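The row-vectorization identity of Proposition 6 and the index rule of Definition 4 are easy to confirm numerically. The sketch below uses NumPy's `numpy.kron` and row-major flattening, which stacks the rows of a matrix and therefore matches the "vectorize by rows" convention; the random test matrices and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

# Proposition 6: A X B = C  <=>  (A kron B^T) x = c, with rows stacked
G = np.kron(A, B.T)
x = X.reshape(-1)            # row-major flattening stacks the rows of X
c = (A @ X @ B).reshape(-1)
identity_holds = np.allclose(G @ x, c)

# Definition 4 in zero-based indices: [A kron B][r*o + s, i*p + j] = a[r, i] * b[s, j]
o, p = B.shape
r, i, s, j = 1, 2, 0, 1
element_matches = np.isclose(np.kron(A, B)[r * o + s, i * p + j], A[r, i] * B[s, j])
```

Both `identity_holds` and `element_matches` evaluate to true for this data.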


4.2.  Another  product 

Recall that the matrix equation $AXB = C$ can be transformed to the form $(A \otimes B^T)x = c$, where $x$ and $c$ "vectorize" $X$ and $C$ by rows. Now, suppose we wished to express the matrix equation $AX^TB = C$ as $Gx = c$, where $x$ and $c$ are the same as before. Motivated by this problem, we will define a new operator.

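Definition 8 (KT-operator). Let $\mathbb{F}$ denote a field. If $A \in \mathbb{F}^{m \times n}$ and $B \in \mathbb{F}^{o \times p}$, then the KT-operation on $A$ and $B$, written $A \overset{T}{\otimes} B$, is defined element-wise by

$$[A \overset{T}{\otimes} B]_{kl} = a_{rj}b_{si},$$

where $k = (r-1)o + s$ and $l = (i-1)n + j$.

Proposition 9 (KT-product). If $A \in \mathbb{F}^{m \times n}$ and $B \in \mathbb{F}^{o \times p}$, then the KT-operation $A \overset{T}{\otimes} B$ is a well-defined product.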


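Proposition 10. Let $C \in \mathbb{F}^{m \times n}$, $A \in \mathbb{F}^{m \times o}$, $X \in \mathbb{F}^{p \times o}$, $B \in \mathbb{F}^{p \times n}$. Let $AX^TB = C$ be a linear matrix equation in $X$. "Vectorize" $X$ and $C$ as follows:

$$x = [X_{1*} \;\; X_{2*} \;\; \cdots \;\; X_{p*}]^T, \qquad c = [C_{1*} \;\; C_{2*} \;\; \cdots \;\; C_{m*}]^T.$$

Then $AX^TB = C$ is equivalent to the equation $(A \overset{T}{\otimes} B^T)x = c$.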


The  proofs  of  the  above  propositions  are  trivial  modifi¬ 
cations  of  the  corresponding  proofs  for  the  Kronecker 
product  case. 


Remark 11 (Relationship between Kronecker product and KT product). If $X$ is a matrix and the column vector $x$ is $X$ vectorized by columns, then there exists a permutation matrix $S$, whose elements are all either 0 or 1, such that $Sx$ is $X^T$ vectorized by columns. Then $AX^TB = C$ is equivalent to both $(A \otimes B^T)Sx = c$ and $(A \overset{T}{\otimes} B^T)x = c$.

Using the KT product is preferable to using the Kronecker product and a permutation for two reasons: the notation is more compact, and the operation count is lower (the operation count for computing a KT product is the same as that required for a Kronecker product, while multiplying permutation matrices is costly due to the large size of the matrix).
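A direct implementation of the KT-operation makes Proposition 10 concrete. The sketch below assumes the element rule $[A \overset{T}{\otimes} B]_{kl} = a_{rj}b_{si}$ with $k = (r-1)o + s$ and $l = (i-1)n + j$ as stated in Definition 8; the function name `kt_product` and the use of `numpy.einsum` are illustrative choices rather than anything prescribed by the text.

```python
import numpy as np

def kt_product(A, B):
    """KT-operation of A (m x n) and B (o x p), giving an (m*o) x (p*n) matrix.

    Zero-based form of Definition 8: out[r*o + s, i*n + j] = A[r, j] * B[s, i].
    """
    m, n = A.shape
    o, p = B.shape
    # Axis order (r, s, i, j) so that reshape yields rows r*o + s and columns i*n + j
    return np.einsum("rj,si->rsij", A, B).reshape(m * o, p * n)

rng = np.random.default_rng(1)
m, n, o, p = 2, 3, 4, 5
A = rng.standard_normal((m, o))
X = rng.standard_normal((p, o))
B = rng.standard_normal((p, n))

# Proposition 10: A X^T B = C  <=>  (A KT B^T) x = c, with X and C stacked by rows
G = kt_product(A, B.T)
lhs = G @ X.reshape(-1)
rhs = (A @ X.T @ B).reshape(-1)
kt_identity_holds = np.allclose(lhs, rhs)
```

Because the operator is formed in a single pass over the entries of $A$ and $B$, no separate permutation matrix has to be built or multiplied, which is the cost advantage noted in Remark 11.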


Proposition 7 (Kronecker product of positive definite matrices). If $A$ and $B$ are two positive definite matrices, then $A \otimes B$ is also positive definite.

5. Constructing a Π test

Before the Π test can be constructed, we must first obtain expressions for the partial derivatives of $\mathcal{H}$ and for the linearized equations of motion.
5.1. Partial derivatives of $\mathcal{H}$

Recall the Hamiltonian for our optimization problem is

$$\mathcal{H} = \mathrm{tr}\{P_1Q + (P_1 - P_{12} - P_{12}^T + P_2)K^TRK + P_3Q_f + \Lambda(A_{cl}P + PA_{cl}^T + B_{cl}B_{cl}^T)\}. \tag{25}$$

Note that since $P$ appears linearly in $\mathcal{H}$, $\partial^2\mathcal{H}/(\partial p_{kl}\,\partial p_{ij}) = 0$ for all $i, j, k, l$.

The second partials of $\mathcal{H}$ can be found using the results on second partials of traces with respect to matrices developed in Section 3. These results can be expressed in a more compact notation if we "vectorize" the parameter matrices and write our results in terms of Kronecker products and KT-products. As an illustrative example, we will derive the second partial of $\mathcal{H}$ with respect to $K$.

Using the formulas in Section 3, one can find that

$$\frac{\partial^2\mathcal{H}(P,K,L,\Lambda)}{\partial k_{gh}\,\partial k_{ef}} = R_{ge}[P_1 - P_{12} - P_{12}^T + P_2]_{fh} + R_{eg}[P_1 - P_{12} - P_{12}^T + P_2]_{hf}. \tag{26}$$

Let $\delta K$ be a small variation in $K$. Vectorize $\delta K$ by rows, i.e. $\delta k_{(q-1)n+r} = \delta K_{qr}$. Define $\mathcal{H}_{kk}(P,K,L,\Lambda)$ as follows:

$$\delta k^T\,\mathcal{H}_{kk}(P,K,L,\Lambda)\,\delta k = \sum_{g=1}^{m}\sum_{h=1}^{n}\sum_{e=1}^{m}\sum_{f=1}^{n} \delta K_{gh}\,\frac{\partial^2\mathcal{H}(P,K,L,\Lambda)}{\partial k_{gh}\,\partial k_{ef}}\,\delta K_{ef}. \tag{27}$$

Then, using what we know about Kronecker products and Eqs. (27) and (26), we have

$$\mathcal{H}_{kk}(P,K,L,\Lambda) = 2R \otimes [P_1 - P_{12} - P_{12}^T + P_2]. \tag{28}$$

In an entirely analogous way, we can define $l, p_1, p_{12}, p_{13}, p_2, p_{23}, p_3$ with respect to $L$ and $P$. Then $\mathcal{H}_{kl}$, $\mathcal{H}_{kp_1}$, $\mathcal{H}_{kp_{12}}$, $\mathcal{H}_{kp_{13}}$, $\mathcal{H}_{kp_2}$, $\mathcal{H}_{kp_{23}}$, $\mathcal{H}_{kp_3}$, $\mathcal{H}_{ll}$, $\mathcal{H}_{lp_1}$, $\mathcal{H}_{lp_{12}}$, $\mathcal{H}_{lp_{13}}$, $\mathcal{H}_{lp_2}$, $\mathcal{H}_{lp_{23}}$, $\mathcal{H}_{lp_3}$ can be determined in terms of Kronecker and KT-products of the system matrices. These expressions are given in Appendix B.

5.2. Linearization of the equation of motion

The covariance $P$ satisfies the differential equation

$$\dot{P}(t) = A_{cl}(t)P(t) + P(t)A_{cl}(t)^T + B_{cl}(t)B_{cl}(t)^T. \tag{29}$$

To linearize this bilinear form, suppose that $P^0, K^0, L^0$ are nominal solutions that satisfy (29). Then take small variations so that $P = P^0 + \delta P$, $K = K^0 + \delta K$, $L = L^0 + \delta L$. We can eliminate the higher-order terms and express the result in terms of "vectorized" quantities. This is easily accomplished using the rules for "vectorizing" matrix equations given in the sections discussing the Kronecker and KT products. For instance,

$$\begin{aligned} \delta\dot{p}_1 = {} & [(A - BK^0) \otimes I + I \otimes (A - BK^0)]\,\delta p_1 + [(BK^0) \overset{T}{\otimes} I + I \otimes (BK^0)]\,\delta p_{12} \\ & + [-B \otimes P_1^0 + B \otimes P_{12}^0 - P_1^0 \overset{T}{\otimes} B + P_{12}^0 \overset{T}{\otimes} B]\,\delta k. \end{aligned} \tag{30}$$

The state space equations for this and the other $\delta p_i$'s (which are given in Appendix C) can be put together into a large linear system

$$\delta\dot{p} = F\,\delta p + G\begin{bmatrix} \delta k \\ \delta l \end{bmatrix}. \tag{31}$$

A transfer function from the parameter variations $\delta k$ and $\delta l$ to the states $\delta p$ can then be computed in the standard way:

$$\delta p(s) = (sI - F)^{-1}G\begin{bmatrix} \delta k(s) \\ \delta l(s) \end{bmatrix}. \tag{32}$$

5.3. The Π test

We will now create a Π test for the fixed structure strong stabilization problem, following the same general strategy used in the state feedback case (Bittanti et al., 1973; Bernstein & Gilbert, 1980). Consider nonlinear system (16) and associated cost (15). Let (31) be the linearization of the dynamics described in (16). Suppose also that we have found a set of static control and observer gains that meet the first-order necessary conditions for optimality.

Definition 12. An optimal periodic control problem is said to be proper if there exists a period $\tau$ and admissible control gains $K(t), L(t)$ such that

$$J[K(t), L(t), \tau] < J^0, \tag{33}$$

where $J^0$ is the cost corresponding to the optimal steady-state solution of the problem, using the static gains $K^0, L^0$. Hence, a strong variation in the controller gains from the steady-state solution has a lower cost.

Note that the term "proper" has historically been used both to describe the optimality of periodic optimal control problems and to describe transfer functions that have more poles than zeros. To avoid confusion, we shall always explicitly state whether it is a periodic optimal control problem or a transfer function that is proper.


Definition 13. An optimal periodic control problem is said to be locally proper if there exists a period $\tau$ and admissible weak variations $\delta K(t), \delta L(t)$ in the controller gains such that

$$J[K^0 + \delta K(t), L^0 + \delta L(t), \tau] < J^0, \tag{34}$$

where $J^0$ is the cost corresponding to the optimal steady-state solution of the problem, using the static gains $K^0, L^0$. Here, a weak variation in the controller gains from the optimal steady-state solution yields a lower cost.

Definition 14. Let $(P,K,L)$ be a steady-state admissible triple. The optimal periodic control problem is normal at $(P,K,L)$ if the following condition is satisfied for some $\tau$:

$$\mathrm{rank}[(e^{F\tau} - I_n) \;\; G \;\; FG \;\; \cdots \;\; F^{n-1}G] = n. \tag{35}$$

Remark 15. Note that $(F, G)$ controllability is sufficient to ensure normality.

For convenience, we will drop the use of functional notation for $\mathcal{H}$ and its derivatives; any usage is assumed to occur at the stationary point. Let us define $u(s) = [\delta k(s)^T \;\; \delta l(s)^T]^T$. Using the techniques of the previous subsections, we can construct $\mathcal{H}_{up}$, $\mathcal{H}_{uu}$, and $H(s)$, where $H(s)$ is the transfer function from $u(s)$ to $\delta p(s)$. We also know that $\mathcal{H}_{pp} = 0$, $\mathcal{H}_{pu} = \mathcal{H}_{up}^T$.


Theorem 16. If the local minimum of the optimal steady-state problem is normal and the $(m \times m)$ Hermitian matrix

$$\Pi(\omega) = \mathcal{H}_{pu}^TH(j\omega) + H(-j\omega)^T\mathcal{H}_{pu} + \mathcal{H}_{uu} \tag{36}$$

is partially negative for some $\omega > 0$, then the optimal periodic control problem is locally proper (and hence proper). Conversely, if the optimal periodic control problem is locally proper, then there exists $\omega > 0$ such that $\Pi(\omega)$ is not positive definite.

Proof. The proof for this theorem is the same as that given in Bittanti et al. (1973) and Bernstein and Gilbert (1980), where the control input is the vectorized parameters $u$. □

Corollary 17 (Implications for strictly proper plants). If the plant transfer function is strictly proper (i.e., $D = 0$), then there is a frequency $\Omega$ such that the optimal periodic control problem cannot be locally proper for frequencies greater than $\Omega$.

Proof. The magnitude of $H(j\omega)$ must attenuate at high frequencies due to the asymptotic stability of the stationary solution. Hence, $\Pi(j\omega) \to \mathcal{H}_{uu}$ as $\omega \to \infty$. This means that if the optimization problem satisfies the Legendre-Clebsch condition, $\Pi(j\omega)$ must be positive definite for large enough $\omega$. Now, the elements of $\mathcal{H}_{uu}$ are given by

$$\mathcal{H}_{kk} = 2R \otimes [P_1 - P_{12} - P_{12}^T + P_2], \tag{37}$$

$$\mathcal{H}_{ll} = 2\Lambda_2 \otimes [\Gamma_2\Gamma_2^T], \tag{38}$$

$$\mathcal{H}_{kl} = 2D^T \overset{T}{\otimes} [P_{13}^T\Lambda_{13}] + 2D^T \overset{T}{\otimes} [P_3\Lambda_3] + 2D^T \overset{T}{\otimes} [P_{23}^T\Lambda_{23}], \tag{39}$$

so since $D = 0$, $\mathcal{H}_{uu}$ is positive definite if and only if both $\mathcal{H}_{kk}$ and $\mathcal{H}_{ll}$ are positive definite. We know that $R$, $\Lambda_2$, and $\Gamma_2\Gamma_2^T$ are all positive definite. The quantity $[P_1 - P_{12} - P_{12}^T + P_2]$ must be positive definite, since $P$ is positive definite and $[P_1 - P_{12} - P_{12}^T + P_2] = [I \;\; {-I} \;\; 0]\,P\,[I \;\; {-I} \;\; 0]^T$. Hence $\mathcal{H}_{kk}$ and $\mathcal{H}_{ll}$ are the Kronecker products of positive definite matrices, which means they are positive definite themselves. So $\Pi(j\omega)$ converges to a positive definite matrix as $\omega \to \infty$, implying that there is a frequency $\Omega$ such that $\Pi(j\omega)$ is positive definite for all $\omega > \Omega$. By the results of the previous theorem, the optimal periodic control problem cannot be locally proper for frequencies $\omega > \Omega$. □

We thus have the interesting result that a chattering control that is a weak variation from the static optimum can never produce a better cost than the static optimum for any plant with a strictly proper transfer function.

6. Designing periodic optimal controllers

Once we have determined that only periodic controller gains can strongly stabilize the system, or if the Π test has established that periodic gains offer better performance than static ones, it remains to design these gains. We base our design methodology on standard optimal periodic control design practices, such as those in Speyer (1996).

Note first that choosing periodic gains makes the system matrices $A_{cl}$ and $B_{cl}$ periodic by Eq. (14). This in turn makes the solution to the Lyapunov equation (16) periodic, with a period that is the least common multiple of the periods of the elements of $A_{cl}$ and $B_{cl}$ (from the Lyapunov Lemma of Bittanti, Bolzern, & Colaneri, 1985).

Hence, if we specify a periodic functional form for the gains $K$ and $L$, such as

$$K = K_0 + \sum_{k=1}^{N} K_{k1}\sin(k\omega t) + K_{k2}\cos(k\omega t),$$

$$L = L_0 + \sum_{k=1}^{N} L_{k1}\sin(k\omega t) + L_{k2}\cos(k\omega t),$$

then we can optimize cost (15) with respect to the parameters $\omega$, $\{K_{k1}, K_{k2}, L_{k1}, L_{k2}\}$, and the elements of $P(0)$. This optimization is subject to the constraints that $P(0)$ is a positive definite matrix and that $P$ is periodic with period $\tau = 2\pi/\omega$. Alternatively, a periodic spline function (de Boor, 1978) for the gains can be chosen, with the constraints

$$K(0) = K(\tau), \quad \left.\frac{dK}{dt}\right|_{t=0} = \left.\frac{dK}{dt}\right|_{t=\tau}, \quad L(0) = L(\tau), \quad \left.\frac{dL}{dt}\right|_{t=0} = \left.\frac{dL}{dt}\right|_{t=\tau},$$

and the collocation points as optimization parameters. Again, the elements of $P(0)$ appear as optimization parameters and in the constraints that $P$ is $\tau$-periodic and $P(0)$ is positive definite. The constraint that $P(0)$ is positive definite can be phrased in two ways:

(1) Require the leading principal minors of $P(0)$ to be positive.

(2) Parameterize $P(0)$ by its $\mathcal{U}\mathcal{D}\mathcal{U}^T$ decomposition $P(0) = \mathcal{U}(0)\mathcal{D}(0)\mathcal{U}(0)^T$, where $\mathcal{U}$ is unit upper triangular and $\mathcal{D}$ is diagonal. $P(0)$ is then positive definite if and only if all diagonal elements of $\mathcal{D}(0)$ are positive. The periodicity constraints on $P$ are transferred to the parameters $\mathcal{U}$ and $\mathcal{D}$, whose differential equations are given in Tapley and Peters (1980).

Fig. 1. Minimum eigenvalue of $\Pi(j\omega)$ vs. $\omega$ (frequency in rad/sec).

These nonlinearly constrained optimization problems can be solved by standard methods such as Sequential Quadratic Programming (Wilson, 1963; Boggs & Tolle, 1996) or accelerated gradient projection (Speyer, Kelley, Levine, & Denham, 1971).

Finally, note that there is no restriction in the above problem formulations requiring existence of a static solution. But nonlinear optimization problems should never be undertaken lightly; the Π test indicates when it is useful to attempt the difficult process of time-varying controller generation if a strongly stable LTI solution has already been found.

7. Examples

7.1. $\Pi$ test for a plant with a DC term

Consider the linear system and cost given by

$A = 1$, $B = 1$, $C = 1.5$, $D = 1$, $\Gamma_1 = 1$,
$\Gamma_2 = 1$, $Q = 1$, $R = 1$, ft = 0.01, JY = 1.

Note that the open-loop transfer function, $(s + 0.5)/(s - 1)$,
meets the parity interlacing property (Youla et al., 1974)
and, therefore, the plant may be stabilized by a stable linear
time-invariant controller. However, the resulting conventional
LQG controller is unstable.

A static solution for the modified cost given by (13) was
found using the methods in Toivonen and Makila (1985).
The results of the local optimization were $K^\circ = 3.9112$,
$L^\circ = 1.1774$. The pole of the static optimal controller was
then $-0.0724$. The static optimum gains were also calculated
for several other values of R. The $\Pi$ test was then
performed for each cost function and corresponding static
optimal controller. For each case, the minimum eigenvalue
of $\Pi$ is plotted vs. frequency in Fig. 1. Note that when
$R = 1$, the minimum eigenvalue of $\Pi$ is never negative.
Hence, there is no instance at which a lower cost can be
realized via periodic gains. However, if R is reduced, the
cost may potentially be reduced below the static optimum
value. When $R = 0.3$, the minimum eigenvalue of $\Pi$ falls
below 0 for frequencies between 2 and 10.5 rad/s. If R
is reduced to 0.2, the minimum eigenvalue of $\Pi$ is negative
for all frequencies greater than 2 rad/s, which means
a chattering solution may reduce the cost below that of
the static optimum. Note that the plant's transfer function
is not strictly proper, so the restriction on the optimality
of high-frequency gains given by Corollary 17 does not
apply.
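
The numbers quoted for this scalar example can be reproduced with a few lines of arithmetic. The sketch below assumes that the static compensator's dynamics are governed by $A - BK - LC + LDK$, the combination that appears in (C.3); the variable names are illustrative and not taken from the original.

```python
# Scalar plant and the static optimal gains reported in the text.
A, B, C, D = 1.0, 1.0, 1.5, 1.0
K, L = 3.9112, 1.1774

# C (s - A)^{-1} B + D = (D s + (C B - D A)) / (s - A)
plant_zero = -(C * B - D * A) / D     # zero of the open-loop transfer function
plant_pole = A                        # unstable open-loop pole

# Assumed compensator dynamics: A - B K - L C + L D K (cf. (C.3)).
controller_pole = A - B * K - L * C + L * D * K

print(plant_zero, plant_pole, controller_pole)
```

The small negative controller pole agrees, to the precision of the rounded gains, with the value reported in the text, so the static compensator is stable but only marginally so.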

7.2. $\Pi$ test for a flexible structure

The problem of positioning the tip of a flexible robot
arm using only sensors and actuators at the base of the arm
can be described by the following linear system and cost
parameters:

$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & -1 & 0 \end{bmatrix}$,   $B = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}$,

$\Gamma_1 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}$,   $Q = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$,

$C = [1 \;\; 0 \;\; 0 \;\; 0]$,   $D = 0$,   $\Gamma_2 = 1$,

$R = 10^{-3}$,   Qt $= 10^{-2} I_4$,   rf $= I_4$,

where $I_4$ denotes the four-dimensional unit matrix.

Fig. 2. Minimum eigenvalue of $\Pi(j\omega)$ vs. $\omega$.

The open-loop transfer function for this system is
$((s + j)(s - j))/(s^2 (s + 1.4142j)(s - 1.4142j))$, which satisfies the
parity interlacing property (Youla et al., 1974). This plant
may thus be stabilized by a stable LTI controller. Despite
this, the LQG gains yield an unstable controller.
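
The stated transfer function can be checked against the state-space data. The sketch below is a verification aid rather than part of the original analysis; it assumes the four-state A, B, C listed for this example and uses the determinant identity $\det(sI - A + BC) = \det(sI - A)\,(1 + C(sI - A)^{-1}B)$, which holds for a single-input single-output system with $D = 0$.

```python
import numpy as np

A = np.array([[0.0, 1.0, 0.0, 0.0],
              [-1.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [1.0, 0.0, -1.0, 0.0]])
B = np.array([[0.0], [1.0], [0.0], [0.0]])
C = np.array([[1.0, 0.0, 0.0, 0.0]])

# Denominator: characteristic polynomial det(sI - A).
den = np.round(np.real(np.poly(A)), 6)
# Numerator (SISO, D = 0): det(sI - A + BC) - det(sI - A).
num = np.round(np.real(np.poly(A - B @ C)) - np.real(np.poly(A)), 6)

zeros = np.roots(num)   # expected at +j and -j
poles = np.roots(den)   # expected at 0, 0, +1.4142j, -1.4142j
```

Rounding the coefficients before calling `np.roots` removes the tiny floating-point residue that would otherwise be read as a spurious leading coefficient.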

A static solution for the modified cost given by (13) was
found using the methods in Toivonen and Makila (1985).
The strongly stabilizing results of the local optimization
were

$K = [8.1188 \;\; 2.0586 \;\; -3.7766 \;\; 4.9878]$,

$L = [7.8756 \;\; 6.0895 \;\; 5.0344 \;\; -1.7341]^T$.

The $\Pi$ test was then performed; Fig. 2 plots the minimum
eigenvalue of $\Pi$ vs. frequency. Note that the high-frequency
behavior is as Corollary 17 predicts, and that the optimization
problem is locally proper only across a very
narrow frequency band.

7.3. Periodic optimal control

Strongly stabilizing periodic optimal controllers were
generated for the one-dimensional plant described at the
beginning of the section with parameters $Q = 1$, $R = 0.2$,
using the methods described in Section 6. The performance
of these controllers can be evaluated by comparing them
to standard LQG optimal controllers, because the optimal
LQG cost when the strong stability constraint is removed
forms a lower bound for the cost that a strongly stabilizing
controller can achieve. Using Sequential Quadratic Programming,
we calculated strongly stabilizing optimal values
for K and L when they were each parameterized by either
three elements in a harmonic series or by 10 collocation
points for a cubic spline spaced equally in a period. Table 1
compares the costs achieved by these compensators with
the costs achieved by the strongly stabilizing static gains
and the unstable LQG compensator. Note that the cost
corresponding to the spline-parameterized strongly stabilizing
controller was 29% closer to the lower bound LQG
cost than that of the optimal strongly stabilizing static
controller.

Table 1
Efficiency comparison of compensators

Controller gain type    Cost by Eq. (9)    Cost by Eq. (13)
LQG                     3.6544             Undefined
Static gains            4.0444             4.0992
Three harmonics         3.9636             4.0054
10 point spline         3.9295             3.9685

Fig. 3. Controller gain vs. time (seconds) for the cubic spline and three-harmonic parameterizations.

Fig. 4. Estimator gain vs. time (seconds) for the cubic spline and three-harmonic parameterizations.
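
The 29% figure can be recovered from the Eq. (9) column of Table 1. The sketch below assumes that "closer to the lower bound" means the fraction of the gap between the static cost and the LQG cost that a periodic controller closes; the helper name `gap_closed` is illustrative.

```python
# Costs by Eq. (9), taken from Table 1.
lqg_bound = 3.6544      # unstable LQG compensator (lower bound)
static_cost = 4.0444    # strongly stabilizing static gains
periodic_costs = {"Three harmonics": 3.9636, "10 point spline": 3.9295}

def gap_closed(cost):
    """Fraction of the static-to-LQG cost gap recovered by a periodic controller."""
    return (static_cost - cost) / (static_cost - lqg_bound)

closure = {name: gap_closed(cost) for name, cost in periodic_costs.items()}
for name, frac in closure.items():
    print(f"{name}: {100 * frac:.1f}% of the gap closed")
```

Under this reading the spline parameterization closes roughly 29% of the gap and the three-harmonic parameterization roughly 21%.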

Figs. 3 and 4 plot the optimal values of the controller gain
K(t) and the estimator gain L(t) over two periods, where
the gains are parameterized both by a spline with 10 collocation
points and by three harmonics of a Fourier series.
Note that the shapes of the gains are approximately the same
for both parameterizations. More interestingly, observe that
when the value of the controller gain is large, the value
of the observer gain is small, and vice versa. The optimal
strongly stabilizing periodic controllers thus oscillate between
controller-dominant and observer-dominant phases in
a smooth manner.
The smoothness in the variation of the cubic spline controller's
gains appears to enhance disturbance rejection relative
to controllers constructed using the methods in Savkin
and Petersen (1998). Fig. 5 illustrates the closed-loop responses
to the disturbance sin(t) for both types of controller.
Note also that the Savkin-Petersen controller for this example
was chosen with the smallest possible sampling period
that would guarantee strong stability. When larger sampling
periods were used, the Savkin-Petersen closed-loop system
exhibited even larger deviations.


8. Conclusion

A $\Pi$ test applicable to a linear-quadratic-Gaussian strong
stabilization problem has been developed, determining when
periodic coefficients in the gain matrices can potentially reduce
the cost. One important restriction to the test is that
a stable, strictly proper controller of plant order must be
found to ensure the existence of a strongly stabilizing static
solution. Obviously, if no static solution exists, the optimal
strongly stabilizing controller is time varying.

Techniques were then developed for synthesizing optimal
periodic strongly stabilizing controllers. Because these
techniques are computationally intensive, the $\Pi$ test is valuable
for determining in advance whether a periodic controller
may improve performance. An example demonstrated
that a strongly stable periodic optimal controller generated
with our methods rejected persistent disturbances better than
competing methods.

Methods used to derive the $\Pi$ test in this paper can also be applied
to other control problems. The material in Sections
3 and 4 enables many extensions to the work in Athans
(1968) and Denham and Speyer (1964) on minimization
of functions dependent on matrices. In particular, the techniques
used here can be trivially modified to deal with
problems involving optimizing decentralized controllers for
systems with fixed modes (Wang & Davison, 1973).
Appendix A. Constructing second partials of trace functions

The proofs of the assertions in Proposition 3 all follow
the same general form. Therefore, only the construction
of $(\partial^2/\partial y_{kl}\,\partial x_{ij})\,\mathrm{tr}(XAY^TBYC)$ is provided here. Each
of the other assertions may be constructed using similar
arguments.

Let X have dimension $m \times n$ and let Y have dimension
$o \times p$. We know from Athans (1968) that

$\frac{\partial}{\partial x_{ij}} \mathrm{tr}(XAY^TBYC) = \frac{\partial}{\partial x_{ij}} \mathrm{tr}(AY^TBYCX) = [(AY^TBYC)^T]_{ij} = [C^TY^TB^TYA^T]_{ij}$.   (A.1)

Now, $[Y^TB^TY]_{st}$ can be expressed as $\sum_{q=1}^{o} \sum_{r=1}^{o} y_{qs}\, b_{rq}\, y_{rt}$, so

$[C^TY^TB^TYA^T]_{ij} = \sum_{s=1}^{p} \sum_{t=1}^{p} c_{si}\, [Y^TB^TY]_{st}\, a_{jt} = \sum_{s=1}^{p} \sum_{t=1}^{p} c_{si} \left( \sum_{q=1}^{o} \sum_{r=1}^{o} y_{qs}\, b_{rq}\, y_{rt} \right) a_{jt}$.   (A.2)

Hence

$\frac{\partial^2}{\partial y_{kl}\,\partial x_{ij}} \mathrm{tr}(XAY^TBYC) = \frac{\partial}{\partial y_{kl}} [C^TY^TB^TYA^T]_{ij}$

$= \sum_{t=1}^{p} \sum_{r=1}^{o} c_{li}\, b_{rk}\, y_{rt}\, a_{jt} + \sum_{s=1}^{p} \sum_{q=1}^{o} c_{si}\, y_{qs}\, b_{kq}\, a_{jl}$

J?LPt=  0,  (B.10) 

lp2  =  -A2®C  -  A2  ®  C,  (B.  1 1 ) 

T  T 

yy [j>^  =  — a3®c  4“  amok] 

-  A3  ®  C  +  As  ®  [DK],  (B.12) 

^LPn  =  ~2Aj2  ®  C,  (B.13) 

yyLpn  =  -2Aj3  ®  C  +  2Aj3  ®  [DK],  (B.  14) 


yeLPn  =  -2Al  ®  c  +  2aJ3  ®  [dk]  -  2a23®c,  (b.is) 

where I denotes an $n \times n$ identity matrix.


$= c_{li}\, [B^TYA^T]_{kj} + [BYC]_{ki}\, a_{jl}$.   (A.3)
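
The closed form (A.3) can be spot-checked numerically by differencing the gradient (A.1) with respect to a single entry of Y. The sketch below uses random matrices with hypothetical small dimensions; the function names are illustrative and the check is not part of the original derivation.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
m, n, o, p = 2, 3, 4, 5                  # hypothetical dimensions
A = rng.standard_normal((n, p))
Y = rng.standard_normal((o, p))
B = rng.standard_normal((o, o))
C = rng.standard_normal((p, m))

def grad_x(Y):
    """(A.1): gradient of tr(X A Y^T B Y C) with respect to X (independent of X)."""
    return C.T @ Y.T @ B.T @ Y @ A.T

def second_partial(i, j, k, l):
    """(A.3): c_li [B^T Y A^T]_kj + [B Y C]_ki a_jl."""
    return C[l, i] * (B.T @ Y @ A.T)[k, j] + (B @ Y @ C)[k, i] * A[j, l]

def finite_difference(i, j, k, l, h=1e-6):
    """Central difference of the (i, j) gradient entry with respect to y_kl."""
    E = np.zeros_like(Y)
    E[k, l] = h
    return (grad_x(Y + E)[i, j] - grad_x(Y - E)[i, j]) / (2 * h)

max_err = max(
    abs(second_partial(i, j, k, l) - finite_difference(i, j, k, l))
    for i, j, k, l in itertools.product(range(m), range(n), range(o), range(p))
)
print(max_err)
```

Because the gradient is quadratic in Y, the central difference is exact up to rounding, so the reported error should sit at the level of floating-point noise.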

Appendix B. Second partials of the Hamiltonian

Using the definitions found in Section 5, the second partials
of $\mathcal{H}$ have the following form:

$\mathcal{H}_{KK} = 2R \otimes [P_1 - P_{12} - P_{12}^T + P_2]$,   (B.1)

$\mathcal{H}_{LL} = 2\Lambda_2 \otimes [\Gamma_2 \Gamma_2^T]$,   (B.2)

yf xPi  ~  "h  ®  i 

-[P^i]®/-^!]®/,  (B.3) 

yem  =  [B1AX2]  ®  /  +  [BJAnW 

+  [RK]®I  +  [RK]h,  (B.4) 

=  ~[bta3]  ®  /  -  [bt^i3]®7  ,y. ;;  V 

+  [Dji7A3]  ®  7  +  [DTLTA3j®I,  (B.5) 

yyKPu  =  -2 [RK]  ®I  -  2[RK]®I 

+  2[B1U1]®7  +  2[B'ryl12]®7,  (B.6) 

JYkPu  =  -2[Bt43]  ®  /  +  2[DTLcAj3]  ®  7 

-2[BTAi3]I/,  (B.7) 

yf}CP»  =  2[BTAn]®I  -  2[BTA\3]  ®  7 

+  2  [DTLrAl3]®I,  (B.8) 


Appendix C. Linearized dynamics of the covariance

In Section 5, small variations were made to parameters
in the covariance Lyapunov equation. Using the properties
of the Kronecker and KT products, the expressions for the
small variations in the covariance matrix can be written
compactly as

8Pt  =  [(A  -  BK)  ®  I  +  /  ®  (A  -  BK)]5PX 

+  [ifiK)h  +  /  ®  (BK)]SPl2  +  [-B®Pl 
A-B  ®  Pi2  -  Pi®B  +  Pi2®B]SK ,  (C.l } 

SP2  =  [(A-LC)®I  +  I®(A- LC)]5P2 
+  [-7®(P2Ct)-(P2Ct)®/ 

+ /  ®  (Lr2rJ ) + (ir2r|  )®/]<5z,  (c.2) 

SP3  =  [(A  —  BK  —  LC  4-  LDK )  ®  I 

+1  ®  (A  —  BK  —  LC  +  LDK)]SP3 

t 

+  [  —  B  ®  P3  +  (£25)  ®  P3 -P3®B 
+  PMLD)]SK  +  [-7®(P3CT)  +  7®  (P3KtDt) 
-(P3CT)®I  +  (PiKTDr)®I]8L,  (C.3) 

5Pn  =  [(BK)®I]5P2 

+  [(A  —  BK)  ®I  +  I  ®(A—  LC)]5Pn 
"b[  —  B®Pj2  +  B  ®  P-^BK 

+  [  —  (PnCJ)®I]5L,  (C.4) 

8Pn  =  [(A  —  BK)  ®  7  +  7  ®  (A- BK-LC  +  LDK)] 
x8Pn  +  [(BK)  ®  7]«5P23  +  [-B®P[3 


$\mathcal{H}_{KL} = 2D^T \otimes [P_{13}^T \Lambda_{13}] + 2D^T \otimes [P_3 \Lambda_3] + 2D^T \otimes [P_{23}^T \Lambda_{23}]$,   (B.9)


+  B  ®  Pj3  -  Pn®B  +  P,30(LD)]SK 
+  [  -  (PnCT)®I  +  (Pl3KTDr)®I]dL,  (C.5) 
SPii  =[ (A-LC)®I  +  I®(A-BK-LC  +  LDK)]5P23 
+  [  -  P23®B  +  P23®(LD)]5K  +  [- 1®  (Pj3CT) 

-  (P2iCr)®I  +  (P23KrDr)®I]dL.  (C.6) 
SUMMARY

[...] from a generalization of the least-squares derivation of the Kalman filter. The objective of the filter is to monitor a single fault, called the target fault, and block other faults, which are called nuisance faults. The filter is derived from solving a min-max problem with a generalized least-squares cost criterion which makes the residual sensitive to the target fault, but insensitive to the nuisance faults. It is shown that this filter approximates the properties of the classical fault detection filter such that in the limit where the weighting on the nuisance faults is zero, the generalized least-squares fault detection filter becomes equivalent to the unknown input observer where there exists a reduced-order filter. Filter designs can be obtained for both linear time-invariant and time-varying systems. Copyright © 2000 John Wiley & Sons, Ltd.

KEY WORDS: fault detection and identification; unknown input observer; worst case design; time-varying


1. INTRODUCTION

[...] automatic control demands a high degree of system reliability. This requires [...] invariant subspace which is unobservable to the residual. Recently, [...] have been developed which have improved [...] and are applicable to time-varying systems [2, 3].

In this paper, a generalized least-squares fault detection filter, motivated by Chung and Speyer [2] and Bryson and Ho [4], is presented. A new least-squares problem with an indefinite cost [...]


2. PROBLEM FORMULATION

Consider a linear, observable system with two failure modes [1, 2]

$\dot{x} = Ax + Bu + F_1\mu_1 + F_2\mu_2$   (1a)

$y = Cx + v$   (1b)

where u is the control input, y is the measurement, v is the sensor noise, $\mu_1$ is the target fault and
$\mu_2$ is the nuisance fault. All system variables belong to real vector spaces, $x \in \mathcal{X}$, $u \in \mathcal{U}$ and $y \in \mathcal{Y}$.
System matrices A, B, C, $F_1$ and $F_2$ are time-varying and continuously differentiable. The failure
modes $\mu_1$ and $\mu_2$ model the time-varying amplitude of the failure while the failure signatures $F_1$
and $F_2$ model the directional characteristics of the failure [...]. The output separability test is discussed in Remark 1 of Section 5.
[...] residual in steady state [...]

Assumption 2.1.
$F_1$ and $F_2$ are output separable.

Assumption 2.2.
For time-invariant systems, $(C, A, F_1)$ does not have an invariant zero at the origin.
The objective of blocking the nuisance fault while detecting the target fault can be achieved by
solving the following min-max problem:

$\min_{\mu_1} \max_{\mu_2} \max_{x(t_0)} \; \frac{1}{2} \int_{t_0}^{t} \left( \|\mu_1\|^2_{Q_1^{-1}} - \|\mu_2\|^2_{\gamma Q_2^{-1}} - \|y - Cx\|^2_{V^{-1}} \right) d\tau - \frac{1}{2} \|x(t_0) - \hat{x}_0\|^2_{\Pi_0}$   (2)

subject to (1a). Note that, without the minimization with respect to $\mu_1$, (2) reduces to the standard
least-squares derivation of the Kalman filter [4]. t is the current time and y is assumed given. $Q_1$,
$Q_2$, V and $\Pi_0$ are positive definite, and $\gamma$ is a non-negative scalar. Note that $Q_1$, $Q_2$, $\Pi_0$ and $\gamma$ are
design parameters to be chosen while V may be physically related to the power spectral density of
the sensor noise because of (1b) [4]. The interpretation of the min-max problem is the following.
Let $\mu_1^*$, $\mu_2^*$ and $x^*(t_0)$ be the optimal strategies for $\mu_1$, $\mu_2$ and $x(t_0)$, respectively. Then, $x^*(\tau \mid Y_t)$, the
x associated with $\mu_1^*$, $\mu_2^*$ and $x^*(t_0)$, is the optimal trajectory for x where $\tau \in [t_0, t]$ and given the
measurement history $Y_t = \{ y(\tau) \mid t_0 \le \tau \le t \}$. Since $\mu_1$ maximizes $y - Cx$ and $\mu_2$ minimizes
$y - Cx$, $y - Cx^*$ is made primarily sensitive to $\mu_1$ and minimally sensitive to $\mu_2$. However, since
$x^*$ is the smoothed estimate of the state, a filtered estimate of the state, called $\hat{x}$, is needed for
implementation. From the boundary condition in Section 3, at the current time t, $x^*(t \mid Y_t) = \hat{x}(t)$.
Therefore, $y - C\hat{x}$ is primarily sensitive to the target fault and minimally sensitive to the nuisance
fault. Note that when $Q_1$ is larger, $y - C\hat{x}$ is more sensitive to the target fault. When $\gamma$ is smaller,
$y - C\hat{x}$ is less sensitive to the nuisance fault. In Reference [2], the differential game blocks the
nuisance fault, but does not enhance the sensitivity to the target fault. In Section 5, it is shown that
the filter completely blocks the nuisance fault when $\gamma$ is zero by placing it into an invariant
subspace, called Ker S. Therefore, the residual used for detecting the target fault is

$r = \hat{H}(y - C\hat{x})$   (3)

where $\hat{x}$, the filtered estimate of the state, is given in Section 3 and

$\mathrm{Ker}\,\hat{H} = C\,\mathrm{Ker}\,S$,   $\hat{H} = I - C\,\mathrm{Ker}\,S\,[(C\,\mathrm{Ker}\,S)^T C\,\mathrm{Ker}\,S]^{-1} (C\,\mathrm{Ker}\,S)^T$   (4)

Ker S is given and discussed in Sections 4 and 5.


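3. SOLUTION

In this section, the min-max problem given by (2) is solved [2, 4]. The variational Hamiltonian of
the problem is

$\mathcal{H} = \frac{1}{2} \left( \|\mu_1\|^2_{Q_1^{-1}} - \|\mu_2\|^2_{\gamma Q_2^{-1}} - \|y - Cx\|^2_{V^{-1}} \right) + \lambda^T (Ax + Bu + F_1\mu_1 + F_2\mu_2)$

where $\lambda \in \mathcal{R}^n$ is a continuously differentiable Lagrange multiplier. The first-order necessary
conditions [4] imply that the optimal strategies for $\mu_1$, $\mu_2$ and the dynamics for $\lambda$ are

$\mu_1^* = -Q_1 F_1^T \lambda$,   $\mu_2^* = \frac{1}{\gamma} Q_2 F_2^T \lambda$,   $\dot{\lambda} = -A^T \lambda - C^T V^{-1} (y - Cx)$

with boundary conditions

$\lambda(t_0) = \Pi_0 [x^*(t_0) - \hat{x}_0]$,   $\lambda(t) = 0$   (5)

By substituting $\mu_1^*$ and $\mu_2^*$ into (1a), the two-point boundary value problem requires the solution to

$\begin{bmatrix} \dot{x}^* \\ \dot{\lambda} \end{bmatrix} = \begin{bmatrix} A & \frac{1}{\gamma} F_2 Q_2 F_2^T - F_1 Q_1 F_1^T \\ C^T V^{-1} C & -A^T \end{bmatrix} \begin{bmatrix} x^* \\ \lambda \end{bmatrix} + \begin{bmatrix} Bu \\ -C^T V^{-1} y \end{bmatrix}$   (6)

with boundary conditions (5). The form of (5) suggests that

$\lambda = \Pi (x^* - \hat{x})$   (7)

where $\Pi(t_0) = \Pi_0$, $\hat{x}(t_0) = \hat{x}_0$ and $\hat{x}$ is an intermediate state. By differentiating (7), using (6),
adding and subtracting $\Pi A \hat{x}$ and $C^T V^{-1} C \hat{x}$, the following dynamic filter structure results:

$\Pi \dot{\hat{x}} = \Pi A \hat{x} + \Pi B u + C^T V^{-1} (y - C\hat{x})$,   $\hat{x}(t_0) = \hat{x}_0$   (8)

$-\dot{\Pi} = \Pi A + A^T \Pi + \Pi \left( \frac{1}{\gamma} F_2 Q_2 F_2^T - F_1 Q_1 F_1^T \right) \Pi - C^T V^{-1} C$,   $\Pi(t_0) = \Pi_0$   (9)

Since $x^* = \hat{x}$ at current time t (5), the generalized least-squares fault detection filter is (8). Note
that (8) is used by the residual (3) to detect the target fault.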
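
The step from (7) to (8) and (9) can be written out explicitly. The following is a sketch reconstructed from (5)-(7); the shorthand $M$ for the off-diagonal block of (6) is introduced here for brevity and does not appear in the original.

```latex
\begin{align*}
M &:= \tfrac{1}{\gamma} F_2 Q_2 F_2^T - F_1 Q_1 F_1^T, \\
\dot{\lambda} &= \dot{\Pi}\,(x^* - \hat{x}) + \Pi\,(\dot{x}^* - \dot{\hat{x}})
  && \text{(differentiate (7))} \\
C^T V^{-1} C x^* - A^T \Pi (x^* - \hat{x}) - C^T V^{-1} y
  &= \dot{\Pi}\,(x^* - \hat{x}) + \Pi A x^* + \Pi M \Pi (x^* - \hat{x}) + \Pi B u - \Pi \dot{\hat{x}}
  && \text{(substitute (6) and (7))} \\
0 &= \bigl[\dot{\Pi} + \Pi A + A^T \Pi + \Pi M \Pi - C^T V^{-1} C\bigr](x^* - \hat{x}) \\
  &\quad + \bigl[\Pi A \hat{x} + \Pi B u + C^T V^{-1}(y - C\hat{x}) - \Pi \dot{\hat{x}}\bigr]
  && \text{(add and subtract } \Pi A \hat{x},\; C^T V^{-1} C \hat{x}\text{)}
\end{align*}
```

Requiring the first bracket to vanish for every $x^* - \hat{x}$ gives the Riccati equation (9), and the second bracket then gives the filter (8).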


4. LIMITING CASE

In this section, the min-max problem (2) is solved in the limit where $\gamma$ is zero [2, 5]. When $\gamma$ is zero,
there is no constraint on $\mu_2$ to minimize $y - Cx$. Therefore, the nuisance fault is completely
blocked from the residual, which is shown in Section 5.

In the limit, the min-max problem (2) becomes

$\min_{\mu_1} \max_{\mu_2} \max_{x(t_0)} \; \frac{1}{2} \int_{t_0}^{t} \left( \|\mu_1\|^2_{Q_1^{-1}} - \|y - Cx\|^2_{V^{-1}} \right) d\tau - \frac{1}{2} \|x(t_0) - \hat{x}_0\|^2_{\Pi_0}$   (10)

This problem is singular with respect to $\mu_2$. Therefore, the Goh transformation [5] is used to form
a non-singular problem. Let

$\phi_1(\tau) = \int_{t_0}^{\tau} \mu_2(s)\, ds$,   $\alpha_1 = x - F_2 \phi_1$

By differentiating $\alpha_1$ and using (1a),

$\dot{\alpha}_1 = A\alpha_1 + Bu + F_1\mu_1 + B_1\phi_1$   (11)

where $B_1 = AF_2 - \dot{F}_2$. By substituting $\alpha_1$ into (10), the new min-max problem is

$\min_{\mu_1} \max_{\phi_1} \max_{\alpha_1(t_0^+)} \; \frac{1}{2} \int_{t_0}^{t} \left[ \|\mu_1\|^2_{Q_1^{-1}} - \|\phi_1\|^2_{F_2^T C^T V^{-1} C F_2} - \|y - C\alpha_1\|^2_{V^{-1}} + (y - C\alpha_1)^T V^{-1} C F_2 \phi_1 + \phi_1^T F_2^T C^T V^{-1} (y - C\alpha_1) \right] d\tau - \frac{1}{2} \|\alpha_1(t_0^+) + F_2\phi_1(t_0^+) - \hat{x}_0\|^2_{\Pi_0}$   (12)

subject to (11). If $F_2^T C^T V^{-1} C F_2$ fails to be positive definite, (12) is still a singular problem with
respect to $\phi_1$. Then the Goh transformation has to be used until the problem becomes
non-singular. If $F_2^T C^T V^{-1} C F_2 = 0$, let

$\phi_2(\tau) = \int_{t_0}^{\tau} \phi_1(s)\, ds$,   $\alpha_2 = \alpha_1 - B_1 \phi_2$

Then $\dot{\alpha}_2 = A\alpha_2 + Bu + F_1\mu_1 + B_2\phi_2$ where $B_2 = AB_1 - \dot{B}_1$. If $F_2^T C^T V^{-1} C F_2 \geq 0$, the Goh
transformation is applied only on the singular part [6]. The transformation process stops if the
weighting on $\phi_2$, $B_1^T C^T V^{-1} C B_1$, is positive definite. Otherwise, continue the transformation until
there exists $B_k$ such that the weighting on $\phi_k$, $B_{k-1}^T C^T V^{-1} C B_{k-1}$, is positive definite. Then, in the
limit, the min-max problem (2) becomes

$\min_{\mu_1} \max_{\phi_k} \max_{\alpha_k(t_0^+)} \; \frac{1}{2} \int_{t_0}^{t} \left[ \|\mu_1\|^2_{Q_1^{-1}} - \|\phi_k\|^2_{B_{k-1}^T C^T V^{-1} C B_{k-1}} - \|y - C\alpha_k\|^2_{V^{-1}} + (y - C\alpha_k)^T V^{-1} C B_{k-1} \phi_k + \phi_k^T B_{k-1}^T C^T V^{-1} (y - C\alpha_k) \right] d\tau - \frac{1}{2} \|\alpha_k(t_0^+) + \bar{B}\bar{\phi}(t_0^+) - \hat{x}_0\|^2_{\Pi_0}$   (13)

subject to $\dot{\alpha}_k = A\alpha_k + Bu + F_1\mu_1 + B_k\phi_k$ where $\bar{B} = [F_2 \;\; B_1 \;\; B_2 \;\cdots\; B_{k-1}]$ and
$\bar{\phi} = [\phi_1^T \;\; \phi_2^T \;\cdots\; \phi_k^T]^T$. The min-max problem (13) can be solved similarly to (2). Therefore, the
derivation [6] is not repeated here. The limiting generalized least-squares fault detection filter is

$S\dot{\hat{x}} = SA\hat{x} + SBu + [S B_k (B_{k-1}^T C^T V^{-1} C B_{k-1})^{-1} B_{k-1}^T C^T V^{-1} + C^T \bar{H}^T V^{-1} \bar{H}](y - C\hat{x})$   (14)

where

$-\dot{S} = S\bar{A} + \bar{A}^T S + S[B_k (B_{k-1}^T C^T V^{-1} C B_{k-1})^{-1} B_k^T - F_1 Q_1 F_1^T]S - C^T \bar{H}^T V^{-1} \bar{H} C$   (15)

$\bar{H} = I - CB_{k-1}(B_{k-1}^T C^T V^{-1} C B_{k-1})^{-1} B_{k-1}^T C^T V^{-1}$ and $\bar{A} = A - B_k (B_{k-1}^T C^T V^{-1} C B_{k-1})^{-1} B_{k-1}^T C^T V^{-1} C$,
subject to $\hat{x}(t_0^+) = \hat{x}_0$ and $S(t_0^+) = \Pi_0 - \Pi_0 \bar{B} (\bar{B}^T \Pi_0 \bar{B})^{-1} \bar{B}^T \Pi_0$. However, (14) cannot
be used because S has a null space which is shown in Theorem 4.1. Therefore, a reduced-order
filter for (14) is derived in Section 6.

Theorem 4.1.

$S[B_{k-1} \;\; B_{k-2} \;\cdots\; B_1 \;\; F_2] = 0$.

Proof. The proof is similar to Reference [2] and can be found in Reference [6].
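
Theorem 4.1 is easy to confirm at the initial time, where S is given in closed form. The sketch below uses random data with hypothetical dimensions: it builds $S(t_0^+)$ from a positive definite $\Pi_0$ and a full-column-rank $\bar{B}$, and checks that $\bar{B}$ lies in its null space. It says nothing about later times, for which the Riccati equation (15) would have to be integrated.

```python
import numpy as np

rng = np.random.default_rng(1)
n, cols = 5, 2                                   # hypothetical state and Bbar sizes
G = rng.standard_normal((n, n))
Pi0 = G @ G.T + n * np.eye(n)                    # positive definite Pi_0
Bbar = rng.standard_normal((n, cols))            # stands in for [F2 B1 ... B_{k-1}]

# S(t0+) = Pi0 - Pi0 Bbar (Bbar^T Pi0 Bbar)^{-1} Bbar^T Pi0
S0 = Pi0 - Pi0 @ Bbar @ np.linalg.solve(Bbar.T @ Pi0 @ Bbar, Bbar.T @ Pi0)

residual = np.abs(S0 @ Bbar).max()               # should vanish up to rounding
eigs = np.linalg.eigvalsh((S0 + S0.T) / 2)       # S0 is non-negative definite
rank_S0 = int((eigs > 1e-8).sum())               # expected to equal n - cols
```

The rank deficiency of $S(t_0^+)$ equals the number of columns of $\bar{B}$, which is why (14) cannot be solved for the estimate derivative directly and a reduced-order filter is needed.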
5. PROPERTIES OF THE NULL SPACE OF S

In this section, some properties of the null space of S are given. It is shown that the null space of
S is equivalent to the minimal (C, A)-unobservability subspace for time-invariant systems and
a similar invariant subspace for time-varying systems. Therefore, the limiting generalized least-squares
fault detection filter is equivalent to the unknown input observer and extends it to the
time-varying case. The minimal (C, A)-unobservability subspace is a subspace which is (A − LC)-invariant
and unobservable with respect to (HC, A − LC) for some filter gain L and projector
H [1]. One method for computing the minimal (C, A)-unobservability subspace of $F_2$, called
$\mathcal{T}_2$ here, is $\mathcal{T}_2 = \mathcal{W}_2 \oplus \mathcal{V}_2$ [1] where $\mathcal{W}_2 = \mathrm{Im}[B_{k-1} \;\; B_{k-2} \;\cdots\; B_1 \;\; F_2]$ is the minimal (C, A)-invariant
subspace of $F_2$ and $\mathcal{V}_2$ is the subspace spanned by the invariant zero directions of $(C, A, F_2)$.
Note that the associated H is

$H: \mathcal{Y} \to \mathcal{Y}$,   $\mathrm{Ker}\,H = C\mathcal{T}_2$,   $H = I - CB_{k-1}[(CB_{k-1})^T CB_{k-1}]^{-1}(CB_{k-1})^T$   (16)

Note that $\mathrm{Ker}\,H = \mathrm{Ker}\,\bar{H}$.
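
The projector in (16) is the orthogonal projector whose kernel is the image of $CB_{k-1}$. A minimal numerical sketch, with a hypothetical full-column-rank matrix standing in for $CB_{k-1}$ and an illustrative function name, is:

```python
import numpy as np

def annihilating_projector(M):
    """Return H = I - M (M^T M)^{-1} M^T, the orthogonal projector with Ker H = Im M."""
    return np.eye(M.shape[0]) - M @ np.linalg.solve(M.T @ M, M.T)

# Hypothetical C B_{k-1}: three measurements, two nuisance directions.
CB = np.array([[1.0, 0.0],
               [1.0, 1.0],
               [0.0, 2.0]])
H = annihilating_projector(CB)

blocked = np.abs(H @ CB).max()            # nuisance directions are annihilated
rank_H = np.linalg.matrix_rank(H)         # one direction remains for detection
```

Forming H this way needs only $CB_{k-1}$, which is the practical advantage of (16) noted later in this section.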

Theorem 5.1 shows that the null space of S is a (C, A)-invariant subspace. Theorem 5.2 shows
that the null space of S is contained in the unobservable subspace of (HC, A − LC).

Theorem 5.1.

Ker S is a (C, A)-invariant subspace.

Proof. The dynamic equation of the error, $e = x - \hat{x}$, in the absence of the target fault and
sensor noise can be obtained by using (1) and (14):

$S\dot{e} = [SA - S B_k (B_{k-1}^T C^T V^{-1} C B_{k-1})^{-1} B_{k-1}^T C^T V^{-1} C - C^T \bar{H}^T V^{-1} \bar{H} C]e$

because $SF_2 = 0$. By adding $\dot{S}e$ to both sides and using (15),

$\frac{d}{dt}(Se) = -\{[A - B_k (B_{k-1}^T C^T V^{-1} C B_{k-1})^{-1} B_{k-1}^T C^T V^{-1} C]^T + S[-F_1 Q_1 F_1^T + B_k (B_{k-1}^T C^T V^{-1} C B_{k-1})^{-1} B_k^T]\}Se$   (17)

If the error initially lies in Ker S, (17) implies that the error will never leave Ker S. Therefore,
Ker S is a (C, A)-invariant subspace. □

Theorem 5.2.

Ker S is contained in the unobservable subspace of (HC, A − LC).

Proof. Let $\zeta \in \mathrm{Ker}\,S$. By multiplying (15) by $\zeta^T$ from the left and $\zeta$ from the right,

$\frac{d}{dt}(\zeta^T S \zeta) = \zeta^T C^T \bar{H}^T V^{-1} \bar{H} C \zeta = 0$

Then, $HC\zeta = 0$ because $\bar{H}C\zeta = 0$ and $\mathrm{Ker}\,H = \mathrm{Ker}\,\bar{H}$. From Theorem 5.1, Ker S is a
(C, A)-invariant subspace. Therefore, Ker S is contained in the unobservable subspace of
(HC, A − LC). □

From Theorem 4.1, $C\,\mathrm{Ker}\,S \supseteq CB_{k-1}$. From Theorem 5.2, $C\,\mathrm{Ker}\,S \subseteq CB_{k-1}$. Therefore,
$C\,\mathrm{Ker}\,S = CB_{k-1}$ and $\hat{H}$ (4) is equivalent to H (16). Note that (16) is a better way to form $\hat{H}$,
which is used by the residual (3), because it does not require the solution to the limiting Riccati
equation (15).

For time-invariant systems, it is important to discuss the invariant zero directions when
designing the fault detection filter. The invariant zeros of $(C, A, F_2)$ will become part of the
eigenvalues of the filter if their associated invariant zero directions are not included in the
invariant subspace of $F_2$ [1]. From References [3, 6], the null space of S includes all the invariant
zero directions if the nuisance fault direction is modified to the invariant zero directions.
Therefore, the invariant zeros will not become part of the filter eigenvalues. From Theorem 4.1
and the modified nuisance fault direction, the null space of S contains the minimal (C, A)-unobservability
subspace of $F_2$. By combining with Theorem 5.2, the null space of S is equivalent to the
minimal (C, A)-unobservability subspace of $F_2$, and the limiting generalized least-squares fault
detection filter is equivalent to the unknown input observer. Note that the invariant zero and
minimal (C, A)-unobservability subspace are only defined for time-invariant systems. For time-varying
systems, Theorems 4.1, 5.1 and 5.2 imply that the null space of S is a similar invariant
subspace.

Remark 1.

In order to detect the target fault, $F_1$ cannot intersect the null space of S, which is unobservable
to the residual. If it does, the target fault will be difficult or impossible to detect even though the
filter can still be derived by solving the min-max problem. If $F_1$ does not intersect the null space of
S, $F_1$ and $F_2$ are called output separable [1], and the output separability test can be stated as
$CB_{k-1} \cap C\hat{B}_{\hat{k}-1} = 0$ where $\hat{B}_{\hat{k}-1}$ is the Goh transformation of $F_1$.


6. REDUCED-ORDER FILTER

In this section, the reduced-order filter is derived for the limiting generalized least-squares fault
detection filter (14). The reduced-order filter is necessary for implementation because (14) cannot
be used due to the null space of S. Since S is non-negative definite, there exists a state
transformation $\Gamma$ such that

$\Gamma^T S \Gamma = \begin{bmatrix} \bar{S} & 0 \\ 0 & 0 \end{bmatrix}$   (18)

where $\bar{S}$ is positive definite. Theorem 6.1 provides a way to form the transformation.

Theorem 6.1.

There exists a state transformation $\Gamma$ where

$[Z \;\; \mathrm{Ker}\,S] = \Gamma \begin{bmatrix} Z_1 & 0 \\ 0 & Z_2 \end{bmatrix}$   (19)

Z is any $n \times (n - k_2)$ continuously differentiable matrix such that itself and Ker S span the state
space where $n = \dim \mathcal{X}$ and $k_2 = \dim(\mathrm{Ker}\,S)$. $Z_1$ and $Z_2$ are any $(n - k_2) \times (n - k_2)$ and $k_2 \times k_2$
invertible continuously differentiable matrices, respectively. Then, the $\Gamma$ obtained from (19)
satisfies (18).

Proof.

$\mathrm{Ker}\,S = \Gamma \begin{bmatrix} 0 \\ Z_2 \end{bmatrix} \;\Rightarrow\; S\Gamma \begin{bmatrix} 0 \\ Z_2 \end{bmatrix} = 0 \;\Rightarrow\; \Gamma^T S \Gamma \begin{bmatrix} 0 \\ Z_2 \end{bmatrix} = 0$

Since $Z_2$ is invertible by definition and $\Gamma^T S \Gamma$ is symmetric, (18) is true. □

Note that Theorem 6.1 does not define $\Gamma$ uniquely and $\Gamma$ can be computed a priori because
Ker S can be obtained a priori.
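
A constant-matrix sketch of the construction in (19) is given below. It assumes a randomly generated non-negative definite S with a two-dimensional null space and takes $Z_1$ and $Z_2$ to be identity matrices, the simplest admissible choice; all names and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k2 = 5, 2                                    # hypothetical dimensions
M = rng.standard_normal((n, n - k2))
S = M @ M.T                                     # non-negative definite, rank n - k2

# eigh returns eigenvalues in ascending order, so the first k2 eigenvectors span Ker S.
w, V = np.linalg.eigh(S)
ker_S = V[:, :k2]

# Any Z that completes Ker S to a basis of the state space is admissible;
# adding kernel components shows that the choice is not unique.
Z = V[:, k2:] + ker_S @ rng.standard_normal((k2, n - k2))

Z1, Z2 = np.eye(n - k2), np.eye(k2)
block = np.block([[Z1, np.zeros((n - k2, k2))],
                  [np.zeros((k2, n - k2)), Z2]])
Gamma = np.hstack([Z, ker_S]) @ np.linalg.inv(block)   # solve (19) for Gamma

T = Gamma.T @ S @ Gamma
S_bar = T[:n - k2, :n - k2]                      # positive definite block of (18)
```

Only the leading block of the transformed matrix is nonzero, so the filter states associated with Ker S can be dropped, which is what the reduced-order filter exploits.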

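By applying the transformation to the estimator state, $\Gamma^{-1}\hat{x} \triangleq \hat{\eta} = [\hat{\eta}_1^T \;\; \hat{\eta}_2^T]^T$. By multiplying
(14) by $\Gamma^T$ from the left, using $\Gamma\Gamma^{-1} = I$, and adding $\Gamma^T S \Gamma \frac{d}{dt}(\Gamma^{-1})\hat{x}$ to both sides, the limiting filter
can be transformed into two equations,

$\bar{S}\dot{\hat{\eta}}_1 = \bar{S}(A_{11} - \Gamma_{11})\hat{\eta}_1 + \bar{S}(A_{12} - \Gamma_{12})\hat{\eta}_2 + \bar{S}M_1 u + [\bar{S}G_1 (D_2^T C_2^T V^{-1} C_2 D_2)^{-1} D_2^T C_2^T V^{-1} + C_1^T \bar{H}^T V^{-1} \bar{H}](y - C_1\hat{\eta}_1 - C_2\hat{\eta}_2)$   (20a)

$0 = C_2^T \bar{H}^T V^{-1} \bar{H} (y - C_1\hat{\eta}_1 - C_2\hat{\eta}_2)$   (20b)

where

$\Gamma^{-1}A\Gamma = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$,   $\Gamma^{-1}\dot{\Gamma} = \begin{bmatrix} \Gamma_{11} & \Gamma_{12} \\ \Gamma_{21} & \Gamma_{22} \end{bmatrix}$,   $\Gamma^{-1}B = \begin{bmatrix} M_1 \\ M_2 \end{bmatrix}$,   $C\Gamma = [C_1 \;\; C_2]$,

$\Gamma^{-1}F_1 = \begin{bmatrix} N_1 \\ N_2 \end{bmatrix}$,   $\Gamma^{-1}B_k = \begin{bmatrix} G_1 \\ G_2 \end{bmatrix}$,   $\Gamma^{-1}B_{k-1} = \begin{bmatrix} 0 \\ D_2 \end{bmatrix}$

Note that $\Gamma^{-1}$ and $\dot{\Gamma}$ can be computed a priori from (19). From (20b),

$\bar{H}C_2 = 0$

because $y - C_1\hat{\eta}_1 - C_2\hat{\eta}_2$ is arbitrary. By multiplying (15) by $\Gamma^T$ from the left and $\Gamma$ from the
right, subtracting $\dot{\Gamma}^T S \Gamma$ and $\Gamma^T S \dot{\Gamma}$ from both sides, and using $\Gamma\Gamma^{-1} = I$, the limiting Riccati
equation can be transformed into two equations,

$0 = \bar{S}[A_{12} - \Gamma_{12} - G_1 (D_2^T C_2^T V^{-1} C_2 D_2)^{-1} D_2^T C_2^T V^{-1} C_2]$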
Note that this filter is equivalent to the optimal stochastic fault detection filter [12] which is an
approximate  unknown  input  observer. 


4.  LIMITING  CASE 

In this section, the robust multiple-fault detection filter is determined in the limit as $\gamma_i \to 0$,
$i = 1 \cdots s$, when there is no complementary subspace. It is shown that, if $s = q$, the filter places
each  associated  nuisance  fault  into  the  unobservable  subspace  of  its  associated  projected 
residual  for  both  time-invariant  and  time-varying  systems.  Therefore,  the  filter  becomes 
equivalent  to  the  RDD  filter  in  the  limit  and  extends  the  RDD  filter  to  the  time-varying  case.  In 
Section  4.1,  the  geometric  structure  of  the  detection  filter  is  given  [3].  In  Section  4.2,  the  robust 
multiple-fault  detection  filter  is  determined  in  the  limit.  In  Section  4.3,  the  conditions  to  ensure 
that  the  faults  can  be  isolated  are  discussed. 

4.1. Geometric structure of detection filter

The BJD filter places each fault into an invariant subspace $\mathcal{T}_i$ [3] where

$\mathcal{T}_i = \mathcal{W}_i \oplus \mathcal{V}_i$   (21)

is called the minimal (C, A)-unobservability subspace or the detection space of $F_i$. $\mathcal{W}_i$ is the
minimal (C, A)-invariant subspace of $F_i$ given by

$\mathcal{W}_i = \mathrm{Im}[f_{i,1} \;\; A f_{i,1} \;\cdots\; A^{\delta_{i,1}} f_{i,1} \;\cdots\; f_{i,p_i} \;\; A f_{i,p_i} \;\cdots\; A^{\delta_{i,p_i}} f_{i,p_i}]$   (22)

where $f_{i,j}$ is the jth column of $F_i$, $\delta_{i,j}$ is the smallest non-negative integer such that $CA^{\delta_{i,j}} f_{i,j} \neq 0$
and $p_i = \dim F_i$. $\mathcal{V}_i$ is the subspace spanned by the invariant zero directions of $(C, A, F_i)$. The
RDD filter places each associated nuisance fault $\hat{\mu}_i$ into an invariant subspace $\hat{\mathcal{T}}_i = [\mathcal{T}_1 \;\cdots\; \mathcal{T}_{i-1} \;\; \mathcal{T}_{i+1} \;\cdots\; \mathcal{T}_q]$
which is the unobservable subspace of $(\hat{H}_i C, A - LC)$ where L is the filter gain
and $\hat{H}_i$ is given in (9) [3]. Therefore, each associated nuisance fault is in the unobservable
subspace of its associated projected residual.

For time-varying systems, the minimal (C, A)-invariant subspace of $F_i$ is [10]

=  Im[^!j0  •••  64 1A1  biX 0  *•'  biXfiv  Hmfi  ***  hpiAni  (23) 

which is found from the iteration defined by the Goh transformation (10). For time-varying
systems, the minimal (C, A)-unobservability subspace cannot be determined by (21) because the
concept of invariant zero is for time-invariant systems only.

Remark  1 

Equations (22) and (23) produce the correct invariant subspaces only when $\mathrm{Rank}(C\mathcal{W}_i) = p_i$. If
$\mathrm{Rank}(C\mathcal{W}_i) < p_i$, a new basis for $F_i$ can be obtained such that $\mathrm{Rank}(C\mathcal{W}_i) = p_i$ [17]. For
example, for time-invariant systems, let $F_i = [f_{i,1} \;\; f_{i,2}]$ where $f_{i,1} \neq f_{i,2}$ and $Cf_{i,1} = Cf_{i,2} \neq 0$. Then,
$\mathcal{W}_i = \mathrm{Im}\,[f_{i,1} \;\; f_{i,2}]$ from (22). Since $\mathrm{Rank}(C\mathcal{W}_i) = 1$, (22) does not produce the
correct invariant subspace. By using a different basis for $F_i$, e.g. $[f_{i,1} \;\; f_{i,1} - f_{i,2}]$,
$\mathcal{W}_i = \mathrm{Im}\,[f_{i,1} \;\; f_{i,1} - f_{i,2} \;\; A(f_{i,1} - f_{i,2})]$ from (22), which is equivalent to $\mathrm{Im}\,[f_{i,1} \;\; f_{i,2} \;\; A(f_{i,1} - f_{i,2})]$.
Since $\mathrm{Rank}(C\mathcal{W}_i) = 2$, (22) produces the correct invariant subspace using this new basis of $F_i$.
This invariant subspace can also be confirmed by using the recursive algorithm given in
Reference [3].
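The construction in (22) and the rank condition of Remark 1 can be sketched numerically. The following Python fragment is only an illustration: the function names and the small matrices are hypothetical and do not come from the references. It stacks $f, Af, \ldots, A^{\delta}f$ for each fault column and then tests whether $\mathrm{Rank}(C\mathcal{W}_i) = p_i$.

```python
import numpy as np

def minimal_invariant_subspace(A, C, F, tol=1e-9):
    """Stack f, A f, ..., A^delta f for every column f of F, following (22).

    delta is the smallest non-negative integer with C A^delta f != 0.
    """
    n = A.shape[0]
    cols = []
    for f in F.T:                      # iterate over the columns of F
        v = f.astype(float)
        for _ in range(n):             # at most n powers are examined
            cols.append(v)
            if np.linalg.norm(C @ v) > tol:
                break                  # reached C A^delta f != 0
            v = A @ v
    return np.column_stack(cols)

def rank_condition_holds(C, W, p):
    """Check Rank(C W) == p, the condition stated in Remark 1."""
    return bool(np.linalg.matrix_rank(C @ W) == p)

# Hypothetical data chosen so that C f1 == C f2 != 0, as in the remark.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
C = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
f1 = np.array([1.0, 0.0, 0.0])
f2 = np.array([1.0, 0.0, 1.0])

W_bad = minimal_invariant_subspace(A, C, np.column_stack([f1, f2]))
W_good = minimal_invariant_subspace(A, C, np.column_stack([f1, f1 - f2]))
```

With the original basis the rank test fails, while the basis $[f_{i,1} \;\; f_{i,1} - f_{i,2}]$ adds the column $A(f_{i,1} - f_{i,2})$ and passes it, which mirrors the example in the remark.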
4.2. Limiting robust multiple-fault detection filter

In this section, the robust multiple-fault detection filter is determined in the limit as $\gamma_i \to 0$,
$i = 1, \dots, s$, when there is no complementary subspace and $s = q$. The filter for time-invariant
systems is considered first. Then, the filter for time-varying systems is considered in Remark 3 at
the end of this section. First, it is assumed that in the limit, $\hat{\mathcal{T}}_1 \cdots \hat{\mathcal{T}}_q$ are $(A - LC)$-invariant
where $L$ is in (14). This will be shown to be true later. Then, the filter gain (14) is simplified in the
limit by using Lemma 4.1 so that the simplified filter gain does not require the solution to the
two-point boundary value problem, (12) and (15). Lemma 4.2 shows that the simplified filter
gain minimizes the cost criterion. Therefore, the simplified filter gain is equivalent to (14) in the
limit. Lemma 4.2 also shows that (9) is the optimal projector in the limit. Finally, Theorem 4.3
shows that $\hat{\mathcal{T}}_1 \cdots \hat{\mathcal{T}}_q$ are $(A - LC)$-invariant where $L$ is the simplified filter gain. Therefore, the
filter becomes equivalent to the RDD filter in the limit. Corollary 4.4 shows that $\mathcal{T}_1 \cdots \mathcal{T}_q$ are
$(A - LC)$-invariant where $L$ is the simplified filter gain. Therefore, the filter also becomes
equivalent to the BJD filter in the limit.

Lemma  4.1 

Define  a  new  projector  Ht  where 

H,:%  X,  Ker  H,  =  H,  =  /  - 

for  i  =  1  *  *  *  q.  In  the  limit,  Hj  has  the  following  properties: 

K>  (24a) 

H,W,=  0 

Proof 

See  Appendix  A.l. 

In  the  limit,  by  applying  Lemma  4.1  to  (14), 

(25) 

Note that (25) does not require the solution to the two-point boundary value problem, (12)
and (15), but just the solution to the Riccati equation (13) which can be obtained independently
of $L$. By using the asymptotic expansion of $P_i$ in Reference [12], it can be shown that the product
of the projector of Lemma 4.1 with $P_i$ remains finite in the limit even though $P_i$ goes to infinity.
Therefore, the limiting filter gain (25) remains finite. Lemma 4.2 shows that (25) minimizes the
cost criterion. Therefore, (25) is equivalent to (14) in the limit. Lemma 4.2 also shows that (9) is
the optimal projector in the limit.

Lemma  4.2 

In  the  limit,  the  cost  criterion  associated  with  (25)  is  zero. 

Proof 

See  Appendix  A.2.  n 


(24b) 

□ 
Remark 2

For the single-fault filter, the filter gain (20) goes to infinity in the limit and there exists a
reduced-order filter [12]. For the multiple-fault filter, however, the limiting filter gain (25)
remains finite.

Theorem 4.3 shows that $\hat{\mathcal{T}}_1 \cdots \hat{\mathcal{T}}_q$ are $(A - LC)$-invariant where $L$ is in (25). Therefore, the
filter becomes equivalent to the RDD filter in the limit. Corollary 4.4 shows that $\mathcal{T}_1 \cdots \mathcal{T}_q$ are
$(A - LC)$-invariant where $L$ is in (25). Therefore, the filter also becomes equivalent to the BJD
filter in the limit.

Theorem 4.3

In the limit, $\hat{\mathcal{T}}_1 \cdots \hat{\mathcal{T}}_q$ are $(A - LC)$-invariant where $L$ is in (25).

Proof 

See  Appendix  A.3.  □ 

Corollary 4.4

In the limit, $\mathcal{T}_1 \cdots \mathcal{T}_q$ are $(A - LC)$-invariant where $L$ is in (25).

Proof 

See  Appendix  A.4.  □ 

Remark  3 

For time-varying systems, the minimal $(C,A)$-unobservability subspace cannot be determined by
(21) because the concept of invariant zero is for time-invariant systems only. However, by letting
$\hat{\mathcal{T}}_i = \mathrm{Ker}\,\bar{\Pi}_i$, which is given in Appendix A.3, it can be shown that $\mathrm{Ker}\,\bar{\Pi}_i$ is included in the
unobservable subspace of $(H_iC, A - LC)$ where $L$ is in (25) and $H_i$ is given in (9) [11,12].
Furthermore, $\mathrm{Ker}\,\bar{\Pi}_i$ is equivalent to the unobservable subspace of $(H_iC, A - LC)$ when there is
no complementary subspace. Then, all the lemmas, theorem and corollary in this section can be
shown similarly for time-varying systems. Therefore, the filter extends the RDD and BJD filters
to the time-varying case.

Remark  4 

In the limit, by using Lemma 4.2 and that $\mathrm{tr}\{H_iCP_iC^TH_i\}$ is finite [12], the robust multiple-fault
detection filter problem satisfies

mmff} 

mmmT\ 

for $i = 1, \dots, q$. This implies that the transmissions from the associated nuisance faults to their
associated projected residuals are zero.

4.3. Condition on fault detection and identification

In this section, three conditions to ensure that the faults can be detected and identified are
assumed. First, $C\mathcal{T}_1 \cdots C\mathcal{T}_q$ are independent. If they are not independent, different faults will
produce the same non-zero projected residuals and therefore the faults cannot be identified. This
is equivalent to the output separability condition in Reference [3]. Note that $C\mathcal{T}_1 \cdots C\mathcal{T}_q$ are
independent if and only if $C\mathcal{W}_1 \cdots C\mathcal{W}_q$ are independent.

The other two conditions are assumed for time-invariant systems only. The first condition is
that the invariant zeros of $(C,A,[F_1 \cdots F_q])$ are either the invariant zeros of $(C,A,F_i)$, $i = 1, \dots, q$,
or in the left-half plane. This is from the mutually detectable condition for the RDD filter
because the robust multiple-fault detection filter becomes equivalent to the RDD filter in the
limit. $F_1 \cdots F_q$ are mutually detectable if $(C,A,[F_1 \cdots F_q])$ does not have more invariant zeros than
$(C,A,F_i)$, $i = 1, \dots, q$ [3]. If $F_1 \cdots F_q$ are not mutually detectable, the extra invariant zeros will
become part of the eigenvalues of the detection filter. If the extra invariant zeros are in the right-half
plane, no stable detection filter can be found to isolate these $q$ faults. A numerical example
is given in Section 6.2.3. The second condition is that $(C,A,F_i)$ cannot have invariant zeros at the
origin if $\mu_i$ needs to be detected [12]. This ensures a non-zero projected residual in steady state
when its associated target fault occurs.


5. MINIMIZATION WITH RESPECT TO $H_i$


In this section, the robust multiple-fault detection filter problem is solved with $H_1 \cdots H_s$ derived
from solving the minimization problem instead of defined a priori by (9). From (11), by using
$H_i = \rho_i\rho_i^T$, the minimization problem becomes

$\min_{L,\,H_i} \; \frac{1}{t_1 - t_0} \int_{t_0}^{t_1} \mathrm{tr}\left[\, \sum_{i=1}^{s} \rho_i^T C (W_i + P_i) C^T \rho_i \right] \mathrm{d}t$

subject to (12) and $\rho_i^T\rho_i = I_{m_i}$ where $m_i$ is the rank of (9). By using matrix Lagrange multipliers
$K_i$ and $\Xi_i$ to form the variational Hamiltonian, the first-order necessary conditions imply
that the optimal solution for $L$ and the dynamics of $K_i$ are still (14) and (15), respectively.
Further, from the first-order necessary condition $C(W_i + P_i)C^T\rho_i = \rho_i\Xi_i$, the optimal solution
for $H_i$ is

$H_i = [\,\rho_{i,1} \;\; \rho_{i,2} \cdots \rho_{i,m_i}\,][\,\rho_{i,1} \;\; \rho_{i,2} \cdots \rho_{i,m_i}\,]^T$  (26)

where $\rho_{i,1} \cdots \rho_{i,m_i}$ are the eigenvectors of $C(W_i + P_i)C^T$ associated with the smallest $m_i$
eigenvalues. To obtain the optimal solutions for $L$ and $H_i$, (12), (14), (15) and (26)
have to be solved simultaneously. For the infinite-time case, (14), (17), (19) and (26) have to be
solved simultaneously. In Section 4.2, it is shown that (9) minimizes the cost criterion in the
limit. Therefore, (26) becomes equivalent to (9) in the limit. Note that, for time-invariant
systems, (9) is the projector used by the RDD filter [3].
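The projector in (26) is built from the eigenvectors associated with the smallest eigenvalues of a symmetric matrix. A minimal Python sketch of that single step is given below. The function name and the diagonal test matrix are hypothetical stand-ins for $C(W_i + P_i)C^T$, and the sketch does not address the coupled solution of (12), (14), (15) and (26).

```python
import numpy as np

def projector_from_smallest_eigs(M, m):
    """Return rho @ rho.T, where the columns of rho are the eigenvectors of the
    symmetric matrix M associated with its m smallest eigenvalues, as in (26)."""
    M = 0.5 * (M + M.T)                    # symmetrize against round-off
    eigvals, eigvecs = np.linalg.eigh(M)   # eigh returns ascending eigenvalues
    rho = eigvecs[:, :m]                   # orthonormal, so rho.T @ rho = I_m
    return rho @ rho.T

# Hypothetical stand-in for C (W_i + P_i) C^T.
M = np.diag([3.0, 1.0, 2.0])
H = projector_from_smallest_eigs(M, 1)
```

Because `numpy.linalg.eigh` returns orthonormal eigenvectors, the constraint $\rho_i^T\rho_i = I_{m_i}$ holds by construction and the result is a symmetric projector of rank $m_i$.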


6.  EXAMPLE 

In  this  section,  two  numerical  examples  are  used  to  demonstrate  the  robust  multiple-fault 
detection  filter.  In  Section  6.1,  the  filters  are  derived  in  the  forms  of  unknown  input  observer, 
BJD  filter  and  RDD  filter,  respectively.  In  Section  6.2,  the  filters  are  derived  to  show  that  the 
filter  has  behaviours  similar  to  the  RDD  and  BJD  filters. 
6.1. Example 1

In this section, a linear time-invariant system for the F16XL aircraft [6] is used to demonstrate
the performance of the robust multiple-fault detection filter. The system has four states
(longitudinal velocity $x_u$, normal velocity $x_w$, pitch rate $x_q$ and pitch angle $x_\theta$), one control input
(elevon deflection angle $u_\delta$), four measurements (longitudinal velocity $y_u$, normal velocity $y_w$,
pitch rate $y_q$ and pitch angle $y_\theta$) and one disturbance input (wind gust $u_{wg}$). The system matrices
are

$A = \begin{bmatrix} -0.0674 & 0.0430 & -0.8886 & -0.5587 \\ 0.0205 & -1.4666 & 16.5800 & -0.0299 \\ 0.1377 & -1.6788 & -0.6819 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \quad B_\delta = \begin{bmatrix} -0.1672 \\ -1.5179 \\ -9.7842 \\ 0 \end{bmatrix}$

$B_{wg} = \begin{bmatrix} 0.0430 \\ -1.4666 \\ -1.6788 \\ 0 \end{bmatrix}, \quad C = I$


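Three faults in pitch angle sensor $y_\theta$, elevon deflector $u_\delta$ and wind gust $u_{wg}$ are considered. In
this example, the wind gust is considered as a fault instead of a process noise. The fault
directions are [4]

$F_\theta = \begin{bmatrix} 0 & -0.5587 \\ 0 & -0.0299 \\ 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad F_\delta = \begin{bmatrix} -0.1672 \\ -1.5179 \\ -9.7842 \\ 0 \end{bmatrix}, \quad F_{wg} = \begin{bmatrix} 0.0430 \\ -1.4666 \\ -1.6788 \\ 0 \end{bmatrix}$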

In  Section  6.1.1,  the  filters  are  derived  in  the  form  of  unknown  input  observer  where  s  =  1.  In 
Section  6.1.2,  the  filter  is  derived  in  the  form  of  the  BJD  filter  where  s  =  3.  In  Section  6.1.3,  the 
filter  is  derived  in  the  form  of  the  RDD  filter  where  s  =  2.  In  Section  6.1.4,  the  filter  is  derived  to 
show  that  the  sensitivity  of  the  projected  residuals  to  their  associated  target  faults  can  be 
enhanced. 
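As a quick consistency check on this example, the output separability condition of Section 4.3 can be tested numerically. The sketch below transcribes the matrices above into NumPy and assumes that, because $C = I$ rules out invariant zeros and $Cf \neq 0$ for every fault column, each detection space reduces to $\mathrm{Im}\,F_i$. The variable names are illustrative only.

```python
import numpy as np

# F16XL longitudinal model as listed in this section (C = I).
A = np.array([[-0.0674,  0.0430, -0.8886, -0.5587],
              [ 0.0205, -1.4666, 16.5800, -0.0299],
              [ 0.1377, -1.6788, -0.6819,  0.0],
              [ 0.0,     0.0,     1.0,     0.0]])
C = np.eye(4)

# Fault directions: pitch angle sensor, elevon deflector, wind gust.
F_theta = np.array([[0.0, -0.5587],
                    [0.0, -0.0299],
                    [0.0,  0.0],
                    [1.0,  0.0]])
F_delta = np.array([[-0.1672], [-1.5179], [-9.7842], [0.0]])
F_wg = np.array([[0.0430], [-1.4666], [-1.6788], [0.0]])

# Assumption: with C = I each detection space is Im F_i, so output
# separability amounts to the columns of [C F_theta, C F_delta, C F_wg]
# being linearly independent.
CF = np.hstack([C @ F_theta, C @ F_delta, C @ F_wg])
separable = bool(np.linalg.matrix_rank(CF) == CF.shape[1])
```

The four stacked columns are linearly independent for these numbers, which is consistent with a single filter being able to isolate all three faults when $s = 3$.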


6.1.1.  Unknown  input  observer 

In this section, the filters are derived in the form of unknown input observer where $s = 1$. Since
each filter can detect only one fault, three filters are needed. Let $F_1 = F_\theta$, $F_2 = F_\delta$ and $F_3 = F_{wg}$.
The weightings are chosen as $\gamma_1 = \gamma_2 = \gamma_3 = 10^{-6}$, $Q_1 = 0.1I$, $Q_2 = Q_3 = 1$ and $V = I$. The
steady-state solutions of (13) are obtained for $i = 1, \dots, 3$, respectively. Then, three single-fault
filters (3) are obtained by (20). Figure 1 shows the frequency response from each fault to the
projected residual $H_ir$ (4) of each filter. Note that each filter has only one projected residual $H_ir$
for detecting the fault $F_i$. The projectors $H_1 \cdots H_3$ are defined by (9). The dashed line represents
the pitch angle sensor fault. The dashdot line represents the elevon deflector fault. The solid line
represents the wind gust fault. This example shows that the projected residual of each filter is
only sensitive to its associated target fault, but not to its associated nuisance fault.

Figure 1. Frequency response of the three single-fault filters when $s = 1$ (panels: 1st filter, 2nd filter, 3rd filter; horizontal axis: frequency in rad/s).
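A frequency response of the kind plotted in these figures can be computed from the error dynamics. The sketch below assumes the usual observer error model, $\dot{e} = (A - LC)e + F\mu$ with projected residual $HCe$, and ignores the noise inputs. The function name, the two-state system and the gain $L$ are hypothetical placeholders, not the filter gains derived in this section.

```python
import numpy as np

def fault_to_residual_gain(A, C, L, H, F, omega):
    """Largest singular value of H C (j*omega*I - (A - L C))^{-1} F."""
    n = A.shape[0]
    F = np.asarray(F, dtype=float).reshape(n, -1)
    resolvent_times_F = np.linalg.solve(1j * omega * np.eye(n) - (A - L @ C), F)
    G = H @ C @ resolvent_times_F
    return float(np.linalg.svd(G, compute_uv=False)[0])

# Hypothetical two-state illustration: L is picked so that A - L C = -I.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
C = np.eye(2)
L = A + np.eye(2)
H = np.diag([1.0, 0.0])          # projector that keeps the first output only
F_target = np.array([1.0, 0.0])
F_nuisance = np.array([0.0, 1.0])

gain_target = fault_to_residual_gain(A, C, L, H, F_target, omega=1.0)
gain_nuisance = fault_to_residual_gain(A, C, L, H, F_nuisance, omega=1.0)
```

In this toy case the nuisance direction is annihilated by the projector at every frequency, while the target fault passes with a first-order roll-off. Sweeping `omega` over a logarithmic grid would give curves comparable in form to those described above.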

6.1.2.  Beard-Jones  detection  filter 

In this section, the filter is derived in the form of the BJD filter where $s = 3$. Since the filter can
detect all three faults, only one filter is needed. The filter gain, satisfying (17), (18) and (19), is
obtained by using the gradient method to solve (16) numerically with $H_1$, $H_2$ and $H_3$ defined a priori
by (9). Figure 2 shows the frequency response from each fault to the three projected residuals
$H_1r \cdots H_3r$ (4) of the filter (3). This example shows that one multiple-fault filter works as well as
three single-fault filters.

6.1.3.  Restricted  diagonal  detection  filter 

Since the wind gust is a disturbance, it does not need to be detected, but only needs to be
blocked. Therefore, in this section, the filter is derived in the form of the RDD filter where $s = 2$.
The filter gain, satisfying (17), (18) and (19), is obtained by using the gradient method to solve
(16) numerically with $H_1$ and $H_2$ defined a priori by (9). In Figure 3, the left and middle figures
show the frequency response from each fault to the two projected residuals, $H_1r$ and $H_2r$ (4), of
the filter (3). Note that the filter has only two projected residuals because only two faults, $F_1$ and
$F_2$, are detected. These two figures show that the pitch angle sensor fault and elevon deflector
fault can still be detected and identified even though $s = 2$. To compare with the filter derived in
the previous example where $s = 3$, the right figure in Figure 3 shows the frequency response
from each fault to the projected residual $H_3r$ used for detecting the wind gust fault in the previous
example. This figure shows that the wind gust fault can no longer be identified from the other
two faults. This example shows that the multiple-fault filter still works well after relaxing the
constraint on detecting the wind gust fault.

Figure 2. Frequency response of the multiple-fault filter when $s = 3$ (panels: 1st projected residual, 2nd projected residual, 3rd projected residual).

6.1.4.  Enhancement  of  associated  target  fault  sensitivity 

In this section, another filter in the form of the RDD filter where $s = 2$ is derived to show that
the sensitivity of the projected residuals to their associated target faults can be enhanced. The
weightings are the same except $Q_1 = 0.64I$ and $Q_2 = 4.73$. In Figure 4, the performance of this
filter is compared to the filter derived in the previous example. The left figure shows the
frequency response from the pitch angle sensor fault to its associated projected residuals when
$Q_1 = 0.1I$ and $0.64I$, respectively. The right figure shows the frequency response from the elevon
deflector fault to its associated projected residuals when $Q_2 = 1$ and $4.73$, respectively. This
example shows that the sensitivity of the projected residuals to their associated target faults can
be enhanced by increasing the weightings of the associated target faults.

Figure 3. Frequency response of the multiple-fault filter when $s = 2$ (three panels; horizontal axis: frequency in rad/s).

6.2.  Example  2 

In  this  section,  three  numerical  examples  are  used  to  show  that  the  robust  multiple- 
fault  detection  filter  has  behaviours  similar  to  the  RDD  and  BJD  filters.  In  Section  6.2.1,  the 
filter is derived when the fault has an invariant zero in the right-half plane. In Section 6.2.2,
the filter is derived when the fault has an invariant zero in the left-half plane. In Section 6.2.3,
the filter is derived when the faults are not mutually detectable.

Figure 4. Frequency response of the multiple-fault filter when $s = 2$ (panels: 1st projected residual, 2nd projected residual).


6.2.1.  Right-half-plane  invariant  zero. 

Consider  the  time-invariant  system  from  Reference  [4], 


"0  3  4" 

"  1  ‘ 

-3' 

1  2  3 

,  c= 

0  1  0" 

0  0  1 

.  F,= 

-0.5 

,  f2  = 

1 

0  2 

0.5 

0 

There is no process noise. $(C,A,F_2)$ has an invariant zero at 3 and the invariant zero direction is
$v = [1 \;\; 0 \;\; 0]^T$. By using (21), $\mathcal{T}_1 = \mathrm{Im}\,F_1$ and $\mathcal{T}_2 = \mathrm{Im}\,[F_2 \;\; v]$. Since $\mathcal{T}_1 \oplus \mathcal{T}_2 = \mathcal{X}$, there is no
complementary subspace.

A multiple-fault filter is derived similarly as before to detect and identify these two faults. The
weightings are chosen as $\gamma_1 = \gamma_2 = 10^{-6}$, $Q_1 = Q_2 = 0.25$ and $V = I$. The eigenvectors of the
filter are very close to $\mathcal{T}_1$ and $\mathcal{T}_2$, similar to the BJD filter. Since the invariant zero direction is
approximately included in the invariant subspace of $F_2$ generated by the filter, none of the
eigenvalues of the filter is close to the invariant zero at 3 [3]. The eigenvalues of the filter are
-0.5865, -5.3789 and -7.1102.


6.2.2.  Left-half-plane  invariant  zero 

Consider the same time-invariant system from Section 6.2.1 except $F_2 = [3 \;\; 1 \;\; 0]^T$. $(C,A,F_2)$
has an invariant zero at -3 and the invariant zero direction is $v = [1 \;\; 0 \;\; 0]^T$. By using
(21), $\mathcal{T}_1 = \mathrm{Im}\,F_1$ and $\mathcal{T}_2 = \mathrm{Im}\,[F_2 \;\; v]$. Since $\mathcal{T}_1 \oplus \mathcal{T}_2 = \mathcal{X}$, there is no complementary
subspace.

A multiple-fault filter is derived with the same weightings as in Section 6.2.1. The eigenvectors
of the filter are very close to $\mathcal{T}_1$ and $\mathcal{T}_2$, similar to the BJD filter. Since the invariant zero
direction is approximately included in the invariant subspace of $F_2$ generated by the filter, none
of the eigenvalues of the filter is close to the invariant zero at -3 [3]. The eigenvalues of the filter
are -0.5865, -5.3789 and -7.1102.


Remark  5 

For  the  single-fault  filter  [12],  the  invariant  zero  directions  associated  with  the  left-half-plane 
invariant zeros are not included in the invariant subspace and part of the eigenvalues of the filter
are  very  close  to  the  invariant  zeros.  Although  the  invariant  zero  directions  associated  with  the 
right-half-plane  invariant  zeros  are  included  in  the  invariant  subspace,  part  of  the  eigenvalues  of 
the  filter  are  very  close  to  the  mirror  images  of  the  invariant  zeros.  To  avoid  this  situation,  the 
fault  directions  have  to  be  modified.  However,  as  demonstrated  by  the  numerical  examples  in 
Sections  6.2.1  and  6.2.2,  the  multiple-fault  filter  automatically  includes  the  invariant  zero 
directions  in  the  invariant  subspaces  and  none  of  the  eigenvalues  of  the  filter  is  close  to  the 
invariant  zeros  or  their  mirror  images. 


6.2.3.  Non-mutually  detectable  faults 

Consider the same time-invariant system from Section 6.2.1 except $F_2 = [5 \;\; 1 \;\; 1]^T$. $F_1$ and $F_2$ are
not mutually detectable because $(C,A,[F_1 \;\; F_2])$ has an invariant zero at -1.5 while $(C,A,F_1)$ and
$(C,A,F_2)$ do not have any invariant zero. By using (21), $\mathcal{T}_1 = \mathrm{Im}\,F_1$ and $\mathcal{T}_2 = \mathrm{Im}\,F_2$. Since
$\mathcal{T}_1 \oplus \mathcal{T}_2 \subset \mathcal{X}$, there is a complementary subspace.

A multiple-fault filter is derived with the same weightings as in Section 6.2.1. Two of the
eigenvectors of the filter are very close to $\mathcal{T}_1$ and $\mathcal{T}_2$, similar to the BJD filter. Since $F_1$ and $F_2$ are
not mutually detectable, one of the eigenvalues of the filter is very close to the extra invariant
zero at -1.5 [3]. The eigenvalues of the filter are -1.5008, -5.7648 and -6.8185.
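The invariant zeros quoted in Sections 6.2.1 to 6.2.3 can be checked with a rank test on the system matrix $\begin{bmatrix} zI - A & F \\ C & 0 \end{bmatrix}$, which loses column rank exactly at an invariant zero. The Python sketch below is an illustration under stated assumptions: the function name is hypothetical, the fault directions are read from the example data, and the (3,3) entry of $A$ is not legible in this copy, so the value 5 is assumed. That entry does not influence the zeros tested here, because the zero directions have vanishing second and third components.

```python
import numpy as np

def rosenbrock_min_singular_value(A, C, F, z):
    """Smallest singular value of [[z I - A, F], [C, 0]].

    A value near zero indicates that z is an invariant zero of (C, A, F).
    """
    n = A.shape[0]
    F = np.asarray(F, dtype=float).reshape(n, -1)
    m, p = C.shape[0], F.shape[1]
    R = np.block([[z * np.eye(n) - A, F],
                  [C, np.zeros((m, p))]])
    return float(np.linalg.svd(R, compute_uv=False)[-1])

# System of Section 6.2.1; A[2, 2] = 5.0 is an ASSUMED value (see lead-in).
A = np.array([[0.0, 3.0, 4.0],
              [1.0, 2.0, 3.0],
              [0.0, 2.0, 5.0]])
C = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
F1 = np.array([1.0, -0.5, 0.5])
F2_rhp = np.array([-3.0, 1.0, 0.0])   # Section 6.2.1
F2_lhp = np.array([3.0, 1.0, 0.0])    # Section 6.2.2
F2_nmd = np.array([5.0, 1.0, 1.0])    # Section 6.2.3

sv_rhp = rosenbrock_min_singular_value(A, C, F2_rhp, 3.0)
sv_lhp = rosenbrock_min_singular_value(A, C, F2_lhp, -3.0)
sv_pair = rosenbrock_min_singular_value(A, C, np.column_stack([F1, F2_nmd]), -1.5)
sv_f1_alone = rosenbrock_min_singular_value(A, C, F1, -1.5)
sv_f2_alone = rosenbrock_min_singular_value(A, C, F2_nmd, -1.5)
```

The first three values vanish to machine precision, while the last two stay well away from zero. This agrees with the statement that the zero at -1.5 belongs to the pair $[F_1 \;\; F_2]$ and to neither fault alone.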

Remark  6 

A  multiple-fault  filter  is  also  derived  for  two  non-mutually  detectable  faults  where  the 
extra  invariant  zero  is  in  the  right-half  plane.  Although  a  stable  filter  can  be  derived  numerically 
by  minimizing  the  cost  criterion,  the  minimal  cost  is  large  and  the  filter  cannot  isolate  the 
faults.  This  is  consistent  with  the  BJD  filter  in  that  the  extra  invariant  zero  will  become 
one  of  the  eigenvalues  of  the  filter  if  the  filter  generates  the  invariant  subspaces  to  isolate 
the faults [3]. Therefore, it is impossible to obtain a stable multiple-fault filter that can
isolate  the  faults.  However,  two  single-fault  filters  can  be  used  to  monitor  these  two 
faults. 
7. CONCLUSION

Different  from  other  design  algorithms  for  the  RDD  or  BJD  filter  which  explicitly  force  the 
geometric  structure  by  using  eigenstructure  assignment  or  geometric  theory,  the  robust  multiple- 
fault  detection  filter  is  derived  from  solving  a  stochastic  minimization  problem  and  only  in  the 
limit,  is  the  geometric  structure  of  the  RDD  filter  recovered  and  the  faults  are  completely 
isolated.  When  it  is  not  in  the  limit,  the  filter  only  isolates  the  faults  within  approximate 
unobservable  subspaces.  This  new  feature  allows  the  filter  to  be  potentially  more  robust  because 
of  the  additional  design  freedom  which  allows  different  degrees  of  fault  isolation.  Furthermore, 
a  mechanism  that  enhances  the  sensitivity  of  the  projected  residuals  to  their  associated  target 
faults  is  provided.  Finally,  the  filter  can  be  applied  to  time-varying  systems.  Although  the 
process  of  deriving  the  filter  gain  requires  the  solution  to  a  two-point  boundary  value  problem, 
the  filter  gain  computation  can  be  done  off-line  so  that  the  filter  implementation  is  as 
straightforward  as  the  RDD  filter.  However,  further  research  is  needed  in  developing  a 
numerical  algorithm  to  solve  the  optimization  problem  more  efficiently. 
By multiplying (12) by the projector $\bar{H}_i$ of Lemma 4.1 from the left and right, substituting (A1) and using Lemma A.2 in
Appendix A.5

$\bar{H}_i[-\dot{W}_i + (A - P_iC^TV^{-1}C)W_i + W_i(A - P_iC^TV^{-1}C)^T - W_iC^TV^{-1}CW_i]\bar{H}_i = 0$

Note that, from (20), $A - P_iC^TV^{-1}C$ is the closed-loop $A$ matrix of the filter when only the fault
$F_i$ is detected. Then

$\dot{W}_i = (A - P_iC^TV^{-1}C)W_i + W_i(A - P_iC^TV^{-1}C)^T - W_iC^TV^{-1}CW_i + \xi_i\xi_i^T$

where $\mathrm{Im}\,\xi_i = \hat{\mathcal{T}}_i$ because $\mathrm{Ker}\,\bar{H}_i = \hat{\mathcal{T}}_i$. Since $\hat{\mathcal{T}}_i$ is $(A - P_iC^TV^{-1}C)$-invariant [12], the
controllable subspace of $(A - P_iC^TV^{-1}C, \xi_i)$ is $\hat{\mathcal{T}}_i$ and $\mathrm{Im}\,W_i = \hat{\mathcal{T}}_i$. Since $\mathrm{Ker}\,\bar{H}_i = \hat{\mathcal{T}}_i$, $\bar{H}_iW_i = 0$.
This completes the proof for (24b).


A.2.  Proof  of  Lemma  4.2 

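By multiplying (12) by the projector $\bar{H}_i$ of Lemma 4.1 from the left and right, substituting (25) and using Lemma A.2 in
Appendix A.5

$\bar{H}_i[-\dot{W}_i + (A - P_iC^TV^{-1}C)W_i + W_i(A - P_iC^TV^{-1}C)^T]\bar{H}_i = 0$

Then

$\dot{W}_i = (A - P_iC^TV^{-1}C)W_i + W_i(A - P_iC^TV^{-1}C)^T + \xi_i\xi_i^T$

where $\mathrm{Im}\,\xi_i = \hat{\mathcal{T}}_i$ because $\mathrm{Ker}\,\bar{H}_i = \hat{\mathcal{T}}_i$. Since $\hat{\mathcal{T}}_i$ is $(A - P_iC^TV^{-1}C)$-invariant [12], the
controllable subspace of $(A - P_iC^TV^{-1}C, \xi_i)$ is $\hat{\mathcal{T}}_i$. Then, the image of the controllability
grammian $W_i$ is $\hat{\mathcal{T}}_i$. Since $\mathrm{Ker}\,H_i = C\hat{\mathcal{T}}_i$ from (9), $H_iCW_iC^TH_i = 0$. Therefore

A.3. Proof of Theorem 4.3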


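Since $P_i$ goes to infinity in the limit, $\Pi_i \triangleq P_i^{-1}$ has a null space [12] and

$-\dot{\Pi}_i = \Pi_iA + A^T\Pi_i + \Pi_i\left(\frac{1}{\gamma_i}\hat{F}_i\hat{Q}_i\hat{F}_i^T - F_iQ_iF_i^T + B_wQ_wB_w^T\right)\Pi_i - C^TV^{-1}C$  (A2)

When the associated nuisance fault occurs, the dynamic equation of the error without process
and sensor noises can be written as

$\Pi_i\dot{e} = \Pi_i(A - LC)e + \Pi_i\hat{F}_i\hat{\mu}_i$

By adding $\dot{\Pi}_ie$ to both sides and substituting (A2)

$\frac{\mathrm{d}}{\mathrm{d}t}(\Pi_ie) = -\left[\Pi_iLC + A^T\Pi_i + \Pi_i\left(\frac{1}{\gamma_i}\hat{F}_i\hat{Q}_i\hat{F}_i^T - F_iQ_iF_i^T + B_wQ_wB_w^T\right)\Pi_i - C^TV^{-1}C\right]e + \Pi_i\hat{F}_i\hat{\mu}_i$

Let $\bar{\Pi}_i = \lim_{\gamma_i \to 0}\Pi_i$. Since $\mathrm{Ker}\,\bar{\Pi}_i = \hat{\mathcal{T}}_i$ [12]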

hj  = 


n„  i = j 

,  o, 


(A3) 


(A4) 
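which can be shown similarly to Lemma A.2 in Appendix A.5. In the limit, by substituting (25)
and (A4) into (A3)

$\frac{\mathrm{d}}{\mathrm{d}t}(\bar{\Pi}_ie) = -\left[A^T + \bar{\Pi}_i\left(\frac{1}{\gamma_i}\hat{F}_i\hat{Q}_i\hat{F}_i^T - F_iQ_iF_i^T + B_wQ_wB_w^T\right)\right]\bar{\Pi}_ie + \bar{\Pi}_i\hat{F}_i\hat{\mu}_i$  (A5)

If the error initially lies in $\mathrm{Ker}\,\bar{\Pi}_i$, (A5) implies that the error will never leave $\mathrm{Ker}\,\bar{\Pi}_i$ because
$\bar{\Pi}_i\hat{F}_i = 0$ [12]. Therefore, $\mathrm{Ker}\,\bar{\Pi}_i$ is $(A - LC)$-invariant where $L$ is in (25). Since $\mathrm{Ker}\,\bar{\Pi}_i = \hat{\mathcal{T}}_i$ [12],
$\hat{\mathcal{T}}_i$ is $(A - LC)$-invariant where $L$ is in (25).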

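A.4. Proof of Corollary 4.4

When $s = q$, $\mathcal{T}_i = \hat{\mathcal{T}}_1 \cap \cdots \cap \hat{\mathcal{T}}_{i-1} \cap \hat{\mathcal{T}}_{i+1} \cap \cdots \cap \hat{\mathcal{T}}_q$. From Theorem 4.3, $\hat{\mathcal{T}}_1 \cdots \hat{\mathcal{T}}_q$ are
$(A - LC)$-invariant where $L$ is in (25). Therefore, $\mathcal{T}_1 \cdots \mathcal{T}_q$ are $(A - LC)$-invariant where $L$ is in (25).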

A.5. Lemmas


Lemma A.1

There  exists  a  state  transformation  T 

Zx  0  0  * 

[3rx  •  ••  #}]  =  r  o  ■.  0 

.0  0  2q_ 

where  Zt,  i  =  \  ■  ■  ■  q,  are  any  invertible  matrices  with  dimension  equivalent  to  dim  such  that 
K\  ■■■  Kq  are  in  the  form  of 

TO  0  01 


rjc.r-1  '  |,  r^2r=  o  k2  o  •••  FKqr  = 

L  J  [0  0  0. 

in  the  limit  where  K\---Kq  are  invertible  and  H\---Hq  are  in  the  form  of 

ro  0  01 

H  0 

rTHir=  1  ,  rTH2r  =  o  h2  o  •••  rTnqr  = 

oo 

L°  o  o. 

where  H\  --Hq  are  invertible. 


Ki  0 
0  0 


‘0 

0 

O' 

'H{ 

ot 

0 

,  TtH2T  = 

0 

0 

Hi 

0 

.0 

0 

0. 

Proof 

Since $\mathrm{Ker}\,H_i = C\hat{\mathcal{T}}_i$ from (9) and $\hat{\mathcal{T}}_i$ is $(A - LC)$-invariant in the limit by the assumption, the
unobservable subspace of $(H_iC, A - LC)$ is $\hat{\mathcal{T}}_i$. Then, from the Lyapunov equation (15), the null
space of the observability grammian $K_i$ is $\hat{\mathcal{T}}_i$ in the limit. For $i = 1$

0  1  0  I 

Ker  K\  =  3~i=T  .  =  0=^rrKir 

Zi  z, 
Abstract

A  fault  detection  and  identification  algorithm,  called  optimal  stochastic  fault  detection  filter,  is  determined.  The  objective  of  the  filter 
is  to  detect  a  single  fault,  called  the  target  fault,  and  block  other  faults,  called  the  nuisance  faults,  in  the  presence  of  the  process  and 
sensor  noises.  The  filter  is  derived  by  maximizing  the  transmission  from  the  target  fault  to  the  projected  output  error  while  minimizing  the 
transmission  from  the  nuisance  faults.  Therefore,  the  residual  is  affected  primarily  by  the  target  fault  and  minimally  by  the  nuisance  faults. 
The  transmission  from  the  process  and  sensor  noises  is  also  minimized  so  that  the  filter  is  robust  with  respect  to  these  disturbances.  It  is 
shown  that  the  filter  recovers  the  geometric  structure  of  the  unknown  input  observer  in  the  limit  where  the  weighting  on  the  nuisance  fault 
transmission  goes  to  infinity.  Further,  the  asymptotic  behavior  of  the  filter  near  the  limit  is  determined  by  using  a  perturbation  method. 
Filter  designs  can  be  obtained  for  both  time-invariant  and  time-varying  systems. 
1. Introduction

Any system under automatic control demands a high
degree of system reliability. This requires a health monitoring
system capable of detecting any plant, actuator and
sensor faults as they occur and identifying the faulty components.
One approach, analytical redundancy, which reduces
the need for hardware redundancy, uses the modeled
dynamic relationship between system inputs and measured
system outputs to form a residual process which can be used
for detecting and identifying faults. A popular approach
to analytical redundancy is the unknown input observer
(Chen & Speyer, 2000; Chung & Speyer, 1998; Frank,
1990; Massoumnia, Verghese, & Willsky, 1989; Patton &
Chen, 1992) which divides the faults into two groups: a
single target fault and possibly several nuisance faults. The
nuisance faults are placed in an invariant subspace which
is unobservable to the residual. Therefore, the residual is
only sensitive to the target fault, but not to the nuisance
faults.

In this paper, a design algorithm, called optimal stochastic
fault detection filter, is determined for the unknown input
observer. The filter is derived by maximizing the transmission
from the target fault while minimizing the transmission
from the nuisance faults. The transmission is defined on the
projected output error by using a projector to be derived from
solving the optimization problem. Therefore, the residual is
affected primarily by the target fault and minimally by the
nuisance faults. The transmission from the process and sensor
noises is also minimized so that the filter is robust with
respect to these disturbances. Since certain types of model
uncertainties can be modeled as additive noises (Patton &
Chen, 1992; Douglas, Chen & Speyer, 1997), the filter can
also be made robust to these model uncertainties.

In the limit where the weighting on the nuisance fault
transmission goes to infinity, the filter blocks the nuisance
faults completely. It is shown that the filter places the nuisance
faults into a minimal $(C,A)$-unobservability subspace
for time-invariant systems and a similar invariant subspace
for time-varying systems. Therefore, the filter recovers
the geometric structure of the unknown input observer in
the limit and extends the unknown input observer to the
time-varying case, similar to Chen and Speyer (2000) and
Chung and Speyer (1998). These limiting results are important
in ensuring that both fault detection and identification
can occur. For time-invariant systems, the nuisance fault
directions are generalized to prevent the invariant zeros of
the nuisance faults or their mirror images from becoming
part of the eigenvalues of the filter.

The behavior of the filter near and in the limit can be
determined by using a perturbation method. In particular,
the perturbation method captures the asymptotic behavior of
the Riccati equation that defines the filter gain and generalizes
the result of Kwakernaak and Sivan (1972). Note that
Chen and Speyer (2000) and Chung and Speyer (1998) use
the Goh transformation in singular optimal control theory
(Bell & Jacobson, 1975; Moylan & Moore, 1971) to determine
the filter in the limit. Although the Goh transformation
cannot determine the asymptotic behavior of the filter
near the limit, it is shown that it produces a limiting Riccati
equation which is the same as that determined from the
perturbation method. Finally, the asymptotic approximation to
the ill-conditioned Riccati equation near the limit provides
a robust numerical algorithm by eliminating the large
coefficient in the Riccati equation.

The problem is formulated in Section 2 and its solution
is derived in Section 3. In Section 4, the limiting properties
of the filter are determined. In Section 5, the limiting and
asymptotic behaviors of the filter are determined by using
the perturbation method. In Section 6, numerical examples
are given.


2. Problem formulation

Consider  a  linear  time-varying,  uniformly  observable 
system, 

$$\dot{x} = A x + B_u u + B_w w, \tag{1a}$$

$$y = C x + v, \tag{1b}$$

where $u$ is the control input, $y$ is the measurement, $w$ is
the process noise and $v$ is the sensor noise. Following the
development in (White & Speyer, 1987; Chung & Speyer,
1998), any plant, actuator and sensor faults can be modeled
as additive terms in the state equation (1a). Therefore, a
linear system with $q$ faults can be modeled by

$$\dot{x} = A x + B_u u + B_w w + \sum_{i=1}^{q} \bar{F}_i \bar{\mu}_i, \tag{2a}$$

$$y = C x + v. \tag{2b}$$

The fault magnitudes $\bar{\mu}_i$ are unknown and arbitrary functions
of time that are zero when there is no fault. The fault
directions $\bar{F}_i$ are maps that are known a priori. Assume the
$\bar{F}_i$'s are monic so that $\bar{\mu}_i \neq 0$ implies $\bar{F}_i \bar{\mu}_i \neq 0$. Since the
optimal stochastic fault detection filter is designed to detect
only one fault and block other faults, let $\mu_1 = \bar{\mu}_i$ be the target
fault and $\mu_2 = [\bar{\mu}_1^T \cdots \bar{\mu}_{i-1}^T \ \bar{\mu}_{i+1}^T \cdots \bar{\mu}_q^T]^T$ be the nuisance fault.
Then, (2) can be rewritten as (Massoumnia et al., 1989)

$$\dot{x} = A x + B_u u + B_w w + F_1 \mu_1 + F_2 \mu_2, \tag{3a}$$

$$y = C x + v, \tag{3b}$$

where $F_1 = \bar{F}_i$ and $F_2 = [\bar{F}_1 \cdots \bar{F}_{i-1} \ \bar{F}_{i+1} \cdots \bar{F}_q]$.
The objective of the optimal stochastic fault detection
filter problem is to find a filter gain $L$ for the linear observer,

$$\dot{\hat{x}} = A \hat{x} + B_u u + L(y - C\hat{x}) \tag{4}$$

and a projector $\hat{H}$ for the residual,

$$r = \hat{H}(y - C\hat{x}) \tag{5}$$

such that the residual is affected primarily by the target fault
$\mu_1$ and minimally by the nuisance fault $\mu_2$, process noise
$w$, sensor noise $v$ and initial condition error $x(t_0) - \hat{x}(t_0)$.
It is assumed that $\mu_1$, $\mu_2$, $w$ and $v$ are zero mean, white
Gaussian noises with power spectral densities $Q_1$, $Q_2$, $Q_w$
and $V$, respectively, and the initial state $x(t_0)$ is a random
vector with variance $P_0$. It is also assumed that $\mu_1$, $\mu_2$, $w$
and $v$ are uncorrelated with each other and with $x(t_0)$.

By using (3) and (4), the dynamic equation of the error,
$e = x - \hat{x}$, is

$$\dot{e} = (A - LC)e + F_1 \mu_1 + F_2 \mu_2 + B_w w - L v. \tag{6}$$

Then, the error can be written as

$$e(t) = \Phi(t,t_0)e(t_0) + \int_{t_0}^{t} \Phi(t,\tau)(F_1 \mu_1 + F_2 \mu_2 + B_w w - L v)\, d\tau \tag{7}$$

subject to

$$\frac{d}{dt}\Phi(t,t_0) = (A - LC)\Phi(t,t_0), \tag{8}$$

where $\Phi(t_0,t_0) = I$. The residual (5) can be written as
$r = \hat{H}(Ce + v)$.

An optimal stochastic fault detection filter problem
formulated with a cost criterion based on the residual is
unusable from the statistical viewpoint since the variance of
the residual generates a $\delta$-function due to the sensor noise.
Therefore, the cost criterion will be based on the projected
output error $\hat{H}Ce$. In order to determine the cost criterion,
define

$$h_1(t) \triangleq \hat{H}C \int_{t_0}^{t} \Phi(t,\tau) F_1 \mu_1 \, d\tau, \tag{9a}$$

$$h_2(t) \triangleq \hat{H}C \int_{t_0}^{t} \Phi(t,\tau) F_2 \mu_2 \, d\tau, \tag{9b}$$

$$h_3(t) \triangleq \hat{H}C \left[ \Phi(t,t_0) e(t_0) + \int_{t_0}^{t} \Phi(t,\tau)(B_w w - L v)\, d\tau \right]. \tag{9c}$$

From (7), $E[h_1(t)h_1(t)^T]$ represents the transmission from
$\mu_1$ to $\hat{H}Ce$, $E[h_2(t)h_2(t)^T]$ represents the transmission from
$\mu_2$ to $\hat{H}Ce$ and $E[h_3(t)h_3(t)^T]$ represents the transmission
from $w$, $v$ and $e(t_0)$ to $\hat{H}Ce$ where $E[\,\cdot\,]$ is the expectation
operator. Note that $e(t_0)$ is a zero mean random vector with
variance $P_0$ if $\hat{x}(t_0) = E[x(t_0)]$.

The optimal stochastic fault detection filter problem is to
find the filter gain $L$ and the projector $\hat{H}$ which minimize
the cost criterion,

$$J = \mathrm{tr}\left\{ \frac{1}{\gamma} E[h_2(t)h_2(t)^T] + E[h_3(t)h_3(t)^T] - E[h_1(t)h_1(t)^T] \right\}, \tag{10}$$

where $t$ is the current time and $\gamma$ is a positive scalar. Making
$\gamma$ small places a large weighting on reducing the nuisance
fault transmission. The trace operator forms a scalar cost
criterion of the matrix output error variance. Note that the
power spectral densities $Q_1$ and $Q_2$ are considered as design
parameters. Since no assumption is made on the fault
magnitudes, their white noise representation is a convenience.
When $Q_1$ increases, the transmission from the target fault
increases. When $Q_2$ increases, the transmission from the
nuisance fault decreases. However, the power spectral densities
$Q_w$ and $V$, and the variance $P_0$ can have physical values.
When $Q_w$, $V$ and $P_0$ increase, the transmission from the
process noise, sensor noise and initial condition error decreases,
respectively.

Since the effect of the process and sensor noises on
the residual is explicitly minimized, the filter is robust
with respect to these disturbances. Certain types of model
uncertainties can also be modeled as additive noises
(Patton & Chen, 1992; Douglas et al., 1997). Therefore,
the filter can be made robust to these model uncertainties.
In Section 4, it is shown that the filter recovers the geometric
structure of the unknown input observer in the limit as
$\gamma \to 0$ and the nuisance fault is completely blocked. When
it is not at the limit, the filter is an approximate unknown
input observer and the nuisance fault is partially blocked.
Since the approximate unknown input observer (Chung &
Speyer, 1998; Chen & Speyer, 2000) has the additional
design freedom to determine how much of the nuisance fault
is to be blocked, it is potentially more robust than the
classical unknown input observer (Frank, 1990; Massoumnia
et al., 1989; Patton & Chen, 1992).


3.  Solution 

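In this section, the minimization problem given by (10)
is solved. By using (9), the cost criterion can be rewritten as

$$J = \mathrm{tr}\left\{ \hat{H}C \left[ \int_{t_0}^{t} \Phi(t,\tau)\left( L V L^T + \frac{1}{\gamma} F_2 Q_2 F_2^T - F_1 Q_1 F_1^T + B_w Q_w B_w^T \right)\Phi(t,\tau)^T d\tau + \Phi(t,t_0) P_0 \Phi(t,t_0)^T \right] C^T \hat{H} \right\}$$

which is to be minimized with respect to $L$ and $\hat{H}$ subject to (8)
and that $\hat{H}$ is a projector. By adding the zero term

$$\mathrm{tr}\left\{ \hat{H}C \left[ \Phi(t,t) P(t) \Phi(t,t)^T - \Phi(t,t_0) P(t_0) \Phi(t,t_0)^T - \int_{t_0}^{t} \frac{d}{d\tau}\left[ \Phi(t,\tau) P(\tau) \Phi(t,\tau)^T \right] d\tau \right] C^T \hat{H} \right\}$$

to $J$ and using (8), the minimization problem can be
rewritten as

$$\min_{L,\hat{H}} \ \mathrm{tr}\left\{ \hat{H}C \left[ \int_{t_0}^{t} \Phi(t,\tau)(L - P C^T V^{-1}) V (L - P C^T V^{-1})^T \Phi(t,\tau)^T d\tau \right] C^T \hat{H} + \hat{H} C P(t) C^T \hat{H} \right\} \tag{11}$$

subject to (8) and that $\hat{H}$ is a projector where

$$\dot{P} = A P + P A^T - P C^T V^{-1} C P + \frac{1}{\gamma} F_2 Q_2 F_2^T - F_1 Q_1 F_1^T + B_w Q_w B_w^T \tag{12}$$

and $P(t_0) = P_0$. By inspection, the optimal filter gain is

$$L^* = P C^T V^{-1}. \tag{13}$$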
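The step from the zero term to (11) and (13) can be checked in a few lines. The following is a sketch of that check, using only (8) and (12); the shorthand $W$ for the three noise and fault terms of (12) is introduced here for brevity and is not part of the original notation.

```latex
% Shorthand (assumption of this note): W = (1/\gamma) F_2 Q_2 F_2^T - F_1 Q_1 F_1^T + B_w Q_w B_w^T.
% Transition matrix property implied by (8):
\frac{\partial}{\partial\tau}\Phi(t,\tau) = -\Phi(t,\tau)(A - LC)

% Differentiate the quadratic form and substitute (12) for \dot{P}:
\frac{d}{d\tau}\left[\Phi P \Phi^T\right]
  = \Phi\left[\dot{P} - (A - LC)P - P(A - LC)^T\right]\Phi^T
  = \Phi\left[LCP + PC^TL^T - PC^TV^{-1}CP + W\right]\Phi^T

% Subtracting this from the integrand of J completes the square in L:
LVL^T + W - \left(LCP + PC^TL^T - PC^TV^{-1}CP + W\right)
  = (L - PC^TV^{-1})\,V\,(L - PC^TV^{-1})^T
```

Because $V$ is positive definite, the completed square is nonnegative and vanishes only at $L = PC^TV^{-1}$, which is why (13) follows by inspection. The boundary terms reduce to $P(t)$ because $\Phi(t,t) = I$ and $P(t_0) = P_0$.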

Since $\hat{H}$ is a projector, it can be written as $\hat{H} = \rho\rho^T$ where
$\dim \rho = \mathrm{rank}\, \hat{H}$ and $\rho^T\rho = I$. By substituting (13) into (11)
and using $\hat{H} = \rho\rho^T$, the minimization problem reduces to

$$\min_{\rho} \ \mathrm{tr}\left[\rho^T C P(t) C^T \rho\right]$$

subject to $\rho^T\rho = I$. By using a matrix Lagrange multiplier $\lambda$
to adjoin the constraint to the cost criterion, the first-order
necessary condition is obtained as (Athans, 1968)

$$C P(t) C^T \rho = \rho\lambda.$$

Let $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_m$ be the eigenvalues of $CP(t)C^T$ and
$\rho_1, \rho_2, \ldots, \rho_m$ be the associated eigenvectors. The solution
for the optimal $\rho$ depends on the rank of $\hat{H}$. If the rank
is chosen as one, the optimal $\rho$ is $\rho_m$ and the optimal
projector is

$$\hat{H}^* = \rho_m \rho_m^T. \tag{14}$$

The minimal cost associated with (14) is $\lambda_m$. Note that the
null space of (14) is $\mathrm{Im}[\rho_1 \ \rho_2 \cdots \rho_{m-1}]$ because (14) can
be written as $\hat{H}^* = I - [\rho_1 \ \rho_2 \cdots \rho_{m-1}][\rho_1 \ \rho_2 \cdots \rho_{m-1}]^T$.

In Sections 4 and 5, it is shown that $CP(t)C^T$ has $p_2$
infinite eigenvalues in the limit as $\gamma \to 0$ and $p_2$ large
eigenvalues near the limit when $\gamma$ is small where $p_2 = \dim F_2$. Since
the remaining $m - p_2$ eigenvalues are very small compared
to the $p_2$ large eigenvalues when $\gamma$ is small, the rank of $\hat{H}$
can be chosen as $m - p_2$ and the optimal projector is

$$\hat{H}^* = [\rho_m \ \rho_{m-1} \cdots \rho_{p_2+1}][\rho_m \ \rho_{m-1} \cdots \rho_{p_2+1}]^T. \tag{15}$$

The minimal cost associated with (15) is $\sum_{i=p_2+1}^{m} \lambda_i$. The
null space of (15) is $\mathrm{Im}[\rho_1 \ \rho_2 \cdots \rho_{p_2}]$. Note that both (14)
and (15) are optimal projectors depending on the rank
chosen. In Sections 4 and 5, it is shown that $\mathrm{Im}[\rho_1 \ \rho_2 \cdots \rho_{p_2}]$
contains the nuisance fault completely in the limit and
partially near the limit. Thus, the null space of $\hat{H}^*$ only needs
to include $\mathrm{Im}[\rho_1 \ \rho_2 \cdots \rho_{p_2}]$ in order to block the nuisance
fault. Furthermore, (15) allows at least as much of the target
fault to pass through as (14) because $\mathrm{Im}[\rho_1 \ \rho_2 \cdots \rho_{p_2}] \subseteq
\mathrm{Im}[\rho_1 \ \rho_2 \cdots \rho_{m-1}]$. Therefore, (15) is a better choice than
(14). In Section 4, it is shown that (15) becomes equivalent
to the projector used by the unknown input observer in the
limit.
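The construction of the projector from the eigenvectors of $CP(t)C^T$ can be sketched numerically. The snippet below is a minimal illustration only, assuming a single nuisance direction ($p_2 = 1$), in which case (15) equals $I - \rho_1\rho_1^T$. The power iteration, the function names and the 2x2 test matrix are all choices made for this sketch and are not taken from the paper.

```python
import math

def dominant_eigenvector(m, iters=500):
    """Unit eigenvector for the largest eigenvalue of a symmetric positive
    semidefinite matrix (list of rows), found by plain power iteration."""
    n = len(m)
    # Arbitrary start vector; assumed not orthogonal to the dominant eigenvector.
    v = [1.0 + 0.1 * k for k in range(n)]
    for _ in range(iters):
        w = [sum(m[r][c] * v[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def nuisance_blocking_projector(m):
    """Projector I - rho1 rho1^T, i.e. the rank m - 1 projector of (15) when p2 = 1."""
    rho1 = dominant_eigenvector(m)
    n = len(m)
    return [[(1.0 if r == c else 0.0) - rho1[r] * rho1[c] for c in range(n)]
            for r in range(n)]

# Hypothetical stand-in for C P(t) C^T, with eigenvalues 3 and 1.
cpct = [[2.0, 1.0], [1.0, 2.0]]
h_hat = nuisance_blocking_projector(cpct)
```

For several nuisance directions the same idea applies with the $p_2$ dominant eigenvectors removed, which in practice would be done with a symmetric eigenvalue routine rather than power iteration.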




Remark 1. To implement the optimal stochastic fault
detection filter, the filter gain (13) and the projector (15) are
constructed continuously with respect to time because in the
cost criterion, $t$ is the current time.

Remark 2. When $Q_1 = 0$, the Riccati matrix $P$ is positive
definite. When $Q_1$ increases, $P$ may become indefinite
(Chen, 2000). If $Q_1$ continues to increase, $P$ may have a
finite escape time and goes to $-\infty$. This can be shown by
formulating a linear quadratic regulator problem as the dual
problem of the optimal stochastic fault detection filter
problem and using the result in Speyer (1986). This can be
interpreted as an attempt to make the residual sensitive to the
target fault. If $Q_1$ is too large, the target fault may
destabilize the filter. Therefore, $Q_1$ has to be chosen small enough
to avoid the finite escape time.
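The finite escape mentioned in Remark 2 can be seen in a scalar analogue of (12). This is only an illustrative sketch: a scalar system cannot separate target and nuisance directions, so the fault directions are set to one, and the function name and numbers are hypothetical. With constant scalars, (12) reads $\dot{P} = 2aP - (c^2/V)P^2 + w$ with $w = Q_2/\gamma - Q_1 + Q_w$. An equilibrium exists only if $a^2 + (c^2/V)w \geq 0$; otherwise $\dot{P}$ is bounded above by a negative constant and $P$ escapes to $-\infty$ in finite time.

```python
def scalar_riccati_has_equilibrium(a, c, v, q1, q2, gamma, qw=0.0):
    """Check whether the scalar version of (12) has a real steady state.

    All fault and noise directions are taken as 1 (an assumption of this
    sketch), so the constant forcing term is w = q2 / gamma - q1 + qw.
    """
    w = q2 / gamma - q1 + qw
    discriminant = a * a + (c * c / v) * w
    return discriminant >= 0.0

# Small Q1: an equilibrium exists.
small_q1_ok = scalar_riccati_has_equilibrium(a=-1.0, c=1.0, v=1.0, q1=0.5, q2=0.0, gamma=1.0)
# Large Q1 with no compensating nuisance weighting: finite escape.
large_q1_ok = scalar_riccati_has_equilibrium(a=-1.0, c=1.0, v=1.0, q1=2.0, q2=0.0, gamma=1.0)
```

The same check shows why a small $\gamma$ tends to help in the scalar case: the term $Q_2/\gamma$ enters $w$ with a positive sign and offsets $Q_1$.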


4. Limiting case

In this section, the limiting properties of the optimal
stochastic fault detection filter are determined when $\gamma \to 0$.
It is shown that the filter places the nuisance fault into an
invariant subspace. For time-invariant systems, this invariant
subspace is the minimal $(C,A)$-unobservability subspace of
$F_2$. Therefore, the filter becomes equivalent to the unknown
input observer in the limit. For time-varying systems, there
exists a similar invariant subspace. Therefore, the filter
extends the unknown input observer to the time-varying case.
In Section 4.1, the geometric structure of the unknown
input observer is given (Massoumnia et al., 1989; Chung
& Speyer, 1998). In Section 4.2, the limiting properties of
the filter are determined. In Section 4.3, the nuisance fault
directions are generalized for time-invariant systems to
prevent the invariant zeros of the nuisance fault or their mirror
images from becoming part of the eigenvalues of the filter.
In Section 4.4, the conditions to ensure that the target fault
can be detected are discussed.

4.1. Geometric structure of unknown input observer

The unknown input observer places the nuisance fault
into the invariant subspace $\mathcal{T}_2$ which is unobservable to the
residual (Massoumnia et al., 1989). $\mathcal{T}_2 = \mathcal{W}_2 \oplus \mathcal{V}_2$ is called
the minimal $(C,A)$-unobservability subspace or the detection
space of $F_2$ (Massoumnia, 1986). $\mathcal{W}_2$ is the minimal
$(C,A)$-invariant subspace of $F_2$ given by

$$\mathcal{W}_2 = \mathrm{Im}\left[ f_1 \ A f_1 \cdots A^{\delta_1} f_1 \ \ f_2 \ A f_2 \cdots A^{\delta_2} f_2 \ \cdots \ f_{p_2} \ A f_{p_2} \cdots A^{\delta_{p_2}} f_{p_2} \right], \tag{16}$$

where $f_i$ is the $i$th column of $F_2$, $\delta_i$ is the smallest
non-negative integer such that $C A^{\delta_i} f_i \neq 0$ and
$p_2 = \dim F_2$. $\mathcal{V}_2$ is the subspace spanned by the
invariant zero directions of $(C,A,F_2)$. Note that $\mathcal{T}_2$ is the
unobservable subspace of $(HC, A - LC)$ where $L$ is the
unknown input observer gain and $H$ is a projector with
$\ker H = \mathrm{Im}[C A^{\delta_1} f_1 \ \ C A^{\delta_2} f_2 \cdots C A^{\delta_{p_2}} f_{p_2}]$ (Massoumnia
et al., 1989). Therefore, the nuisance fault is unobservable
to the residual that uses $H$ as the projector.
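The index $\delta_i$ and the chain $f_i, Af_i, \ldots, A^{\delta_i}f_i$ in (16) can be computed directly for a time-invariant system. The sketch below is illustrative only: the function names are hypothetical, the matrices `a` and `c` are those of Example 1 in Section 6.1 as read from the source, and the second direction `[1, 0, 0]` is made up to show a case with $\delta_i = 1$.

```python
def matvec(m, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(row[k] * x[k] for k in range(len(x))) for row in m]

def invariant_subspace_chain(a, c, f):
    """Return (delta, [f, A f, ..., A^delta f]) where delta is the smallest
    non-negative integer with C A^delta f != 0, as used in (16)."""
    chain = [list(f)]
    for delta in range(len(a)):
        if any(y != 0 for y in matvec(c, chain[-1])):
            return delta, chain
        chain.append(matvec(a, chain[-1]))
    # By the Cayley-Hamilton theorem no larger power can help.
    raise ValueError("C A^k f = 0 for all k < n: direction is unobservable")

a = [[0, 3, 4], [1, 2, 3], [0, 2, 5]]
c = [[0, 1, 0], [0, 0, 1]]

# Nuisance direction of Example 1: C f != 0, so delta = 0.
delta_f2, chain_f2 = invariant_subspace_chain(a, c, [5, 1, 1])
# Hypothetical direction with C f = 0 and C A f != 0, so delta = 1.
delta_e1, chain_e1 = invariant_subspace_chain(a, c, [1, 0, 0])
```

Integer inputs keep the zero test exact. With floating-point data a tolerance would be needed in place of `y != 0`.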

For time-varying systems, the minimal $(C,A)$-invariant
subspace of $F_2$ is (Chung & Speyer, 1998)

$$\mathcal{W}_2 = \mathrm{Im}\left[ b_{1,0} \ b_{1,1} \cdots b_{1,\delta_1} \ \ b_{2,0} \ b_{2,1} \cdots b_{2,\delta_2} \ \cdots \ b_{p_2,0} \ b_{p_2,1} \cdots b_{p_2,\delta_{p_2}} \right]. \tag{17}$$

The vectors $b_{i,j}$, $j = 0, 1, \ldots, \delta_i$, are obtained from the iteration
defined by the Goh transformation, i.e., $b_{i,j} = A b_{i,j-1} - \dot{b}_{i,j-1}$
with $b_{i,0} = f_i$ where $f_i$ is the $i$th column of $F_2$ (Bell &
Jacobson, 1975; Moylan & Moore, 1971). $\delta_i$ is the smallest
non-negative integer such that $C b_{i,\delta_i} \neq 0$. For time-varying
systems, the minimal $(C,A)$-unobservability subspace cannot
be determined because the concept of invariant zero is
for time-invariant systems only. The time-varying extension
of $H$ is $\ker H = \mathrm{Im}[C b_{1,\delta_1} \ \ C b_{2,\delta_2} \cdots C b_{p_2,\delta_{p_2}}]$ (Chung &
Speyer, 1998).

Remark 3. Eqs. (16) and (17) produce the correct invariant
subspaces only when $\mathrm{rank}\, C\mathcal{W}_2 = p_2$. If $\mathrm{rank}\, C\mathcal{W}_2 < p_2$, a
new basis for $F_2$ can be obtained such that $\mathrm{rank}\, C\mathcal{W}_2 = p_2$
(Chen, 2000; Chen & Speyer, 2002).

4.2.  Limiting  property 

In this section, it is assumed that the Riccati matrix $P$
is positive definite. From Remark 2, there always exists
positive definite $P$ for some $Q_1$. Then, $P$ can be written as

$$P = \sum_{i=1}^{n} \lambda_i^{-1} \rho_i \rho_i^T$$

where $\lambda_i^{-1}$ is the $i$th eigenvalue of $P$ and $\rho_i$ is the associated
eigenvector. In the limit as $\gamma \to 0$, $P$ goes to infinity because
of the term $(1/\gamma)F_2 Q_2 F_2^T$ in (12) which indicates that some
$\lambda_i$'s go to zero. Define

$$\Pi \triangleq P^{-1} = \sum_{i=1}^{n} \lambda_i \rho_i \rho_i^T.$$

Then, $P$ goes to infinity in the limit along the null space of
$\Pi$. By using $\dot{\Pi} = -\Pi \dot{P} \Pi$ and (12),

$$-\dot{\Pi} = \Pi A + A^T \Pi + \Pi\left( \frac{1}{\gamma} F_2 Q_2 F_2^T - F_1 Q_1 F_1^T + B_w Q_w B_w^T \right)\Pi - C^T V^{-1} C, \tag{18}$$

where $\Pi(t_0) = P_0^{-1}$. Define

$$\bar{\Pi} \triangleq \lim_{\gamma \to 0} \Pi.$$

In the limit, in order for (18) to have a solution,

$$\bar{\Pi} F_2 = 0. \tag{19}$$

This indicates that $\bar{\Pi}$ has a null space which includes $F_2$.
It turns out that $\ker \bar{\Pi}$ is the key to blocking the nuisance
fault. Theorem 4 shows that $\ker \bar{\Pi}$ is a $(C,A)$-invariant
subspace. Therefore, the optimal stochastic fault detection filter
places the nuisance fault into an invariant subspace in the
limit. Theorem 5 shows that $\ker \bar{\Pi}$ also includes the
minimal $(C,A)$-invariant subspace of $F_2$.
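Equation (18) follows from (12) in two lines. The sketch below uses only the identity for the derivative of a matrix inverse and $P\Pi = \Pi P = I$; the shorthand $W$ for the bracketed noise and fault terms is introduced here and is not the paper's notation.

```latex
% Shorthand (assumption of this note): W = (1/\gamma) F_2 Q_2 F_2^T - F_1 Q_1 F_1^T + B_w Q_w B_w^T.
\dot{\Pi} = \frac{d}{dt}P^{-1} = -\Pi\,\dot{P}\,\Pi
          = -\Pi\left(AP + PA^T - PC^TV^{-1}CP + W\right)\Pi
          = -\Pi A - A^T\Pi + C^TV^{-1}C - \Pi W \Pi
```

Multiplying by $-1$ gives (18). Since $W$ contains the unbounded term $(1/\gamma)F_2Q_2F_2^T$, the product $\Pi W \Pi$ stays bounded as $\gamma \to 0$ only if $\Pi F_2 \to 0$, which is the content of (19).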

Theorem 4. $\ker \bar{\Pi}$ is a $(C,A)$-invariant subspace.

Proof. When only the nuisance fault occurs, the dynamic
equation of the error (6) can be written as

$$\Pi \dot{e} = (\Pi A - C^T V^{-1} C)e + \Pi F_2 \mu_2.$$

By adding $\dot{\Pi} e$ to both sides and using (18),

$$\frac{d}{dt}(\Pi e) = -\left[ A^T + \Pi\left( \frac{1}{\gamma} F_2 Q_2 F_2^T - F_1 Q_1 F_1^T + B_w Q_w B_w^T \right) \right] \Pi e + \Pi F_2 \mu_2. \tag{20}$$

In the limit, if the error initially lies in $\ker \bar{\Pi}$, (20) implies
that the error will never leave $\ker \bar{\Pi}$ because of (19).
Therefore, $\ker \bar{\Pi}$ is a $(C,A)$-invariant subspace.


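Theorem 5. $\ker \bar{\Pi}$ includes the minimal $(C,A)$-invariant
subspace of $F_2$.

Proof. Consider the time-varying case first where $\mathcal{W}_2$ is
given by (17). From (19), $\bar{\Pi} b_{1,0} = 0$ and $\dot{\bar{\Pi}} b_{1,0} = -\bar{\Pi} \dot{b}_{1,0}$.
In the limit, by multiplying (18) by $b_{1,0}^T$ from the left and
$b_{1,0}$ from the right, and using $\bar{\Pi} b_{1,0} = 0$,

$$\frac{1}{\gamma} \Pi F_2 Q_2 F_2^T \Pi b_{1,0} = 0. \tag{21}$$

By using $\bar{\Pi} b_{1,0} = 0$, (18) and (21),

$$\bar{\Pi} b_{1,1} = \bar{\Pi}(A b_{1,0} - \dot{b}_{1,0}) = C^T V^{-1} C b_{1,0} = 0.$$

From $\bar{\Pi} b_{1,1} = 0$, it can be shown similarly that $\bar{\Pi} b_{1,2} = 0$.
By iterating this procedure, $\bar{\Pi}[b_{1,0} \ b_{1,1} \cdots b_{1,\delta_1}] = 0$. It
can be shown similarly that $\bar{\Pi}[b_{i,0} \ b_{i,1} \cdots b_{i,\delta_i}] = 0$ for
$i = 2, 3, \ldots, p_2$. Therefore, $\bar{\Pi}\mathcal{W}_2 = 0$. For the time-invariant
case, it can be shown similarly.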

Whether $\ker \bar{\Pi}$ includes the invariant zero directions of
$(C,A,F_2)$ for time-invariant systems is considered now. If
$\ker \bar{\Pi}$ does not include the invariant zero directions, the
invariant zeros will become part of the filter eigenvalues
(i.e., the eigenvalues of $A - LC$) (Massoumnia, 1986).
By using the result in Kwakernaak (1976), if there exist
left-half-plane invariant zeros, part of the filter eigenvalues
will be at the invariant zeros in the limit. If there exist
right-half-plane invariant zeros, part of the filter eigenvalues
will be at the mirror images of the invariant zeros in
the limit. Therefore, $\ker \bar{\Pi}$ includes the invariant zero
directions associated with the right-half-plane invariant zeros,
but not necessarily the invariant zero directions associated
with the left-half-plane invariant zeros. In Section 4.3, the
nuisance fault directions are generalized such that $\ker \bar{\Pi}$
includes all the invariant zero directions. This generalization
prevents the invariant zeros or their mirror images from
becoming part of the filter eigenvalues. This is important
because the invariant zeros or their mirror images might
be ill-conditioned even though they are in the left-half
plane.

For time-invariant systems, $\ker \bar{\Pi} \supseteq \mathcal{W}_2$ from
Theorem 5 and $\ker \bar{\Pi} \supseteq \mathcal{V}_2$ from the generalization of the
nuisance fault directions. Thus, $\ker \bar{\Pi} \supseteq \mathcal{T}_2$. By using
the result in Chung and Speyer (1998) and Chen and
Speyer (2000), $\ker \bar{\Pi} \subseteq \mathcal{T}_2$. Therefore, $\ker \bar{\Pi}$ is equivalent
to the minimal $(C,A)$-unobservability subspace of $F_2$
and the optimal stochastic fault detection filter becomes
equivalent to the unknown input observer in the limit. For
time-varying systems, $\ker \bar{\Pi} \supseteq \mathcal{W}_2$ from Theorem 5. By
using the result in Chen and Speyer (2000), $\ker \bar{\Pi}$ is in
the unobservable subspace of $(HC, A - LC)$. Therefore,
the optimal stochastic fault detection filter places the
nuisance fault into a similar invariant subspace in the limit and
extends the unknown input observer to the time-varying
case.

Remark 6. By using the optimal filter gain (13) and optimal
projector (15), the minimization problem (10) can be written
as

$$\frac{\mathrm{tr}\{E[h_2(t)h_2(t)^T]\} + \gamma\, \mathrm{tr}\{E[h_3(t)h_3(t)^T]\}}{\mathrm{tr}\{E[h_1(t)h_1(t)^T]\}} = \gamma\left( 1 + \frac{\sum_{i=p_2+1}^{m} \lambda_i}{\mathrm{tr}\{E[h_1(t)h_1(t)^T]\}} \right).$$

In the limit as $\gamma \to 0$,

$$\frac{\mathrm{tr}\{E[h_2(t)h_2(t)^T]\}}{\mathrm{tr}\{E[h_1(t)h_1(t)^T]\}} \to 0.$$

This implies that the nuisance fault transmission is zero in
the limit.


Remark 7. Since $P$ goes to infinity in the limit along $\ker \bar{\Pi}$,
$CPC^T$ goes to infinity along $C \ker \bar{\Pi}$. For time-invariant
systems, $C \ker \bar{\Pi} = \mathrm{Im}[C A^{\delta_1} f_1 \ \ C A^{\delta_2} f_2 \cdots C A^{\delta_{p_2}} f_{p_2}]$. For
time-varying systems, $C \ker \bar{\Pi} = \mathrm{Im}[C b_{1,\delta_1} \ \ C b_{2,\delta_2} \cdots C b_{p_2,\delta_{p_2}}]$.
Then, $CPC^T$ has $p_2$ infinite eigenvalues in the limit and
their associated eigenvectors span $C \ker \bar{\Pi}$. Therefore, the
optimal projector (15) becomes equivalent to $H$, which is
used by the unknown input observer, in the limit.


4.3. Generalization of nuisance fault direction

The invariant zero of $(C,A,F_2)$ is defined as $z$ at which

$$\begin{bmatrix} zI - A & F_2 \\ C & 0 \end{bmatrix}$$

loses rank. The invariant zero direction $\nu$ is formed from a
partitioning of the null space as

$$\begin{bmatrix} zI - A & F_2 \\ C & 0 \end{bmatrix} \begin{bmatrix} \nu \\ \bar{\nu} \end{bmatrix} = 0. \tag{22}$$

From Section 4.2, when $f_i$, a column vector of $F_2$,
has a left-half-plane invariant zero $z_i$, $\ker \bar{\Pi}$ includes
$\mathrm{Im}[f_i \ A f_i \cdots A^{\delta_i} f_i]$, but not $\mathrm{Im}\, \nu_i$ where $\nu_i$ is the invariant
zero direction. Also, $z_i$ becomes one of the filter eigenvalues
in the limit. If the nuisance fault direction $f_i$ is replaced
by $\nu_i$, $z_i$ will not become one of the filter eigenvalues.
Furthermore, since $\ker \bar{\Pi}$ includes $\mathrm{Im}[\nu_i \ A\nu_i \cdots A^{\delta_i+1}\nu_i]$
which is equivalent to $\mathrm{Im}[f_i \ A f_i \cdots A^{\delta_i} f_i \ \nu_i]$ by using
(22), this generalization will still block the nuisance fault.
Note that $\ker \bar{\Pi}$ includes the invariant zero direction now.
If the invariant zero is in the right-half plane, this
generalization prevents the mirror image of the invariant zero
from becoming one of the filter eigenvalues in the limit.
If $(C,A,\nu_i)$ has invariant zeros, the same procedure can be
repeated. If the invariant zero is associated with not just
one, but several column vectors of $F_2$, only one of these
vectors needs to be replaced by the invariant zero direction.

4.4. Condition on target fault detection

In this section, two conditions to ensure that the target
fault can be detected are assumed. First, $F_1$ and $\ker \bar{\Pi}$ are
independent, i.e., $F_1 \cap \ker \bar{\Pi} = 0$. Otherwise, the target fault
will be difficult or impossible to detect because it will be
blocked from the residual along with the nuisance fault even
though the filter can still be derived by solving the
minimization problem. This condition is similar to but less restrictive
than the output separability condition in Massoumnia et al.
(1989) and Chung and Speyer (1998), i.e., $C\mathcal{W}_1 \cap C\mathcal{W}_2 = 0$
where $\mathcal{W}_1$ is the minimal $(C,A)$-invariant subspace of $F_1$
which can be obtained similarly by using (16) or (17).
The output separability condition is more restrictive
because there is an invariant subspace formed for the target
fault.

For time-invariant systems, to further ensure a nonzero
residual in steady state when the target fault occurs,
$(C,A,F_1)$ cannot have invariant zeros at the origin. When
only the target fault occurs, the dynamic equation of the
error (6) and the residual without the projector can be
written as

$$\dot{e} = (A - LC)e + F_1 \mu_1,$$
$$r = Ce.$$

For a bias target fault, the residual is zero in steady state if
$(C, A - LC, F_1)$ has an invariant zero at the origin (Chen,
1984). Since the filter gain $L$ does not change the
invariant zero, $(C, A - LC, F_1)$ has an invariant zero at the
origin if and only if $(C,A,F_1)$ has an invariant zero at the
origin.
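The second detection condition can be checked with a rank test on the pencil of Section 4.3 evaluated at $z = 0$. The sketch below is a minimal illustration under one stated assumption: the pencil is taken to have full column rank for generic $z$, so a rank below $n + p$ at $z = 0$ signals an invariant zero at the origin. The function names and the second, two-state test system are hypothetical; the first system is Example 1 of Section 6.1 as read from the source.

```python
from fractions import Fraction

def rank(rows):
    """Matrix rank by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def has_invariant_zero_at_origin(a, f, c):
    """True if [[-A, F], [C, 0]] loses column rank, i.e. the pencil at z = 0.

    Assumes the pencil has full column rank n + p for generic z.
    """
    n, p = len(a), len(f[0])
    top = [[-a[i][j] for j in range(n)] + list(f[i]) for i in range(n)]
    bottom = [list(row) + [0] * p for row in c]
    return rank(top + bottom) < n + p

# Example 1 with the target fault direction F1.
a = [[0, 3, 4], [1, 2, 3], [0, 2, 5]]
c = [[0, 1, 0], [0, 0, 1]]
f1 = [[0], [0], [1]]
example1_zero = has_invariant_zero_at_origin(a, f1, c)

# Hypothetical system with transfer function -s / ((s + 1)(s + 2)).
bias_blind = has_invariant_zero_at_origin([[-1, 0], [0, -2]], [[1], [1]], [[1, -2]])
```

In the second system a constant (bias) fault produces no steady-state output, which is the situation the condition in Section 4.4 is meant to rule out.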


5. Perturbation analysis

In Section 4.2, the limiting properties of the Riccati
matrices $\Pi$ and $P$ were determined. In this section, expressions
for $\Pi$ and $P$ in the limit and near the limit are developed
using a perturbation method. The asymptotic expansions of $\Pi$
and $P$, explicitly expressed as functions of $\gamma$, give an
understanding of $\Pi$ and $P$ when $\gamma$ is small which is the region of
interest for the filter design. In Chen and Speyer (2000) and
Chung and Speyer (1998), the Goh transformation in
singular optimal control theory (Bell & Jacobson, 1975; Moylan
& Moore, 1971) is used to determine $\Pi$ in the limit.
However, the Goh transformation cannot determine $\Pi$ near the
limit. In Section 5.1, $\Pi$ is expanded around $\gamma = 0$. This shows
explicitly the characteristics of $\Pi$ near and in the limit. It
is shown that the limiting $\Pi$ determined from the
perturbation method is the same as that determined from the Goh
transformation. In Section 5.2, the inverse of $\Pi$ is derived.
This shows explicitly the characteristics of $P$ near and in the
limit. The limiting result is consistent with and generalizes
the result of Kwakernaak and Sivan (1972).

5.1. Asymptotic expansion

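In this section, $\Pi$ is expanded around $\gamma = 0$ as

$$\Pi = \sum_{i=0}^{\infty} \gamma^{i/4}\, \Pi_i. \tag{23}$$

By substituting (23) into (18) and collecting terms of
common power, the equations used for finding the $\Pi_i$'s in (23)
are obtained in Lemma 8.

Lemma 8.

$$\Pi = [u_1 \ u_2] \left( \begin{bmatrix} 0 & 0 \\ 0 & \Pi_{022} \end{bmatrix} + \gamma^{1/4} \begin{bmatrix} 0 & 0 \\ 0 & \Pi_{122} \end{bmatrix} + \gamma^{1/2} \begin{bmatrix} \Pi_{211} & \Pi_{212} \\ \Pi_{212}^T & \Pi_{222} \end{bmatrix} + \gamma^{3/4} \begin{bmatrix} \Pi_{311} & \Pi_{312} \\ \Pi_{312}^T & \Pi_{322} \end{bmatrix} + \cdots \right) \begin{bmatrix} u_1^T \\ u_2^T \end{bmatrix},$$

where

$$F_2 Q_2 F_2^T = [u_1 \ u_2] \begin{bmatrix} \sigma & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} u_1^T \\ u_2^T \end{bmatrix} = u_1 \sigma u_1^T,$$

$\sigma > 0$ and $[u_1 \ u_2]$ is unitary. Note that $\mathrm{Im}\, u_1 = \mathrm{Im}\, F_2$. $\Pi_{022}$,
$\Pi_{211}$ and $\Pi_{212}$ must satisfy:

$$0 = \Pi_{211} \sigma \Pi_{211} - R_{11}, \tag{24a}$$

$$0 = \Pi_{211} \sigma \Pi_{212} + A_{21}^T \Pi_{022} - R_{12}, \tag{24b}$$

$$-\dot{\Pi}_{022} = \Pi_{022} A_{22} + A_{22}^T \Pi_{022} - \Pi_{022} Q_{22} \Pi_{022} - R_{22} + \Pi_{212}^T \sigma \Pi_{212}. \tag{24c}$$

$\Pi_{122}$, $\Pi_{311}$ and $\Pi_{312}$ must satisfy:

$$0 = \Pi_{311} \sigma \Pi_{211} + \Pi_{211} \sigma \Pi_{311}, \tag{25a}$$

$$0 = \Pi_{311} \sigma \Pi_{212} + \Pi_{211} \sigma \Pi_{312} + A_{21}^T \Pi_{122}, \tag{25b}$$

$$-\dot{\Pi}_{122} = \Pi_{122}(A_{22} - Q_{22} \Pi_{022}) + (A_{22} - Q_{22} \Pi_{022})^T \Pi_{122} + \Pi_{312}^T \sigma \Pi_{212} + \Pi_{212}^T \sigma \Pi_{312}, \tag{25c}$$

where

$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} = \begin{bmatrix} u_1^T \\ u_2^T \end{bmatrix} \left( A[u_1 \ u_2] - \frac{d}{dt}[u_1 \ u_2] \right),$$

$$\begin{bmatrix} Q_{11} & Q_{12} \\ Q_{12}^T & Q_{22} \end{bmatrix} = \begin{bmatrix} u_1^T \\ u_2^T \end{bmatrix} (F_1 Q_1 F_1^T - B_w Q_w B_w^T)[u_1 \ u_2],$$

$$\begin{bmatrix} R_{11} & R_{12} \\ R_{12}^T & R_{22} \end{bmatrix} = \begin{bmatrix} u_1^T \\ u_2^T \end{bmatrix} C^T V^{-1} C [u_1 \ u_2].$$

The equations for the higher-order terms can be found in
Chen (2000).

Proof. See Appendix A. □

In Lemma 9, the solution of (24) and (25) is discussed
when $CF_2 \neq 0$. In Lemma 10, the solution is discussed when
$CF_2 = 0$ and $C(AF_2 - \dot{F}_2) \neq 0$. The higher-order cases, such
as $CF_2 = C(AF_2 - \dot{F}_2) = 0$ and $C[A(AF_2 - \dot{F}_2) - \frac{d}{dt}(AF_2 - \dot{F}_2)] \neq 0$,
can be considered similarly.

Lemma 9. When $CF_2 \neq 0$,

$$\Pi = [u_1 \ u_2] \left( \begin{bmatrix} 0 & 0 \\ 0 & \Pi_{022} \end{bmatrix} + \gamma^{1/2} \begin{bmatrix} \Pi_{211} & \Pi_{212} \\ \Pi_{212}^T & \Pi_{222} \end{bmatrix} + \cdots \right) \begin{bmatrix} u_1^T \\ u_2^T \end{bmatrix}, \tag{26}$$

where

$$-\dot{\Pi}_{022} = \Pi_{022}(A_{22} - A_{21} R_{11}^{-1} R_{12}) + (A_{22} - A_{21} R_{11}^{-1} R_{12})^T \Pi_{022} + \Pi_{022}(A_{21} R_{11}^{-1} A_{21}^T - Q_{22})\Pi_{022} - (R_{22} - R_{12}^T R_{11}^{-1} R_{12}), \tag{27a}$$

$$\Pi_{211} = R_{11}^{1/2} (R_{11}^{1/2} \sigma R_{11}^{1/2})^{-1/2} R_{11}^{1/2}, \tag{27b}$$

$$\Pi_{212} = \sigma^{-1} \Pi_{211}^{-1} (R_{12} - A_{21}^T \Pi_{022}), \tag{27c}$$

$$\dot{\Pi}_{222} = \Pi_{222}(-A_{22} + Q_{22} \Pi_{022} + A_{21} \Pi_{211}^{-1} \Pi_{212}) + (-A_{22} + Q_{22} \Pi_{022} + A_{21} \Pi_{211}^{-1} \Pi_{212})^T \Pi_{222}. \tag{27d}$$

Proof. See Appendix B. □

Lemma 10. When $CF_2 = 0$ and $C(AF_2 - \dot{F}_2) \neq 0$,

$$\Pi = [u_1 \ \ u_2 v_1 \ \ u_2 v_2] \begin{bmatrix} \gamma^{3/4}\Pi_{311} + \gamma \cdots & \gamma^{1/2}\Pi_{2121} + \gamma^{3/4} \cdots & \gamma^{1/2}\Pi_{2122} + \gamma^{3/4} \cdots \\ \gamma^{1/2}\Pi_{2121}^T + \gamma^{3/4} \cdots & \gamma^{1/4}\Pi_{12211} + \gamma^{1/2} \cdots & \gamma^{1/4}\Pi_{12212} + \gamma^{1/2} \cdots \\ \gamma^{1/2}\Pi_{2122}^T + \gamma^{3/4} \cdots & \gamma^{1/4}\Pi_{12212}^T + \gamma^{1/2} \cdots & \Pi_{02222} + \gamma^{1/4} \cdots \end{bmatrix} \begin{bmatrix} u_1^T \\ v_1^T u_2^T \\ v_2^T u_2^T \end{bmatrix}, \tag{28}$$

where $\mathrm{Im}\, v_1 = \mathrm{Im}\, A_{21}$ and $[v_1 \ v_2]$ is unitary. Only the
lowest-order term of each element is kept for simplicity.
The equation for each term can be found in Appendix C
and Chen (2000).

Proof. See Appendix C. □

When $CF_2 \neq 0$, from Lemma 9,

$$\Pi \to [u_1 \ u_2] \begin{bmatrix} 0 & 0 \\ 0 & \Pi_{022} \end{bmatrix} \begin{bmatrix} u_1^T \\ u_2^T \end{bmatrix} \tag{29}$$

in the limit. Therefore, $\ker \bar{\Pi} \supseteq \mathrm{Im}\, u_1 = \mathrm{Im}\, F_2 = \mathcal{W}_2$ which
is consistent with Theorem 5.

Since the Riccati equation (18) can also be generated by
solving a differential game similar to the one in Chen and
Speyer (2000), the result of (26) gives insight into the
singular differential games in Chung and Speyer (1998) and
Chen and Speyer (2000). A singular differential game
similar to the one in Chen and Speyer (2000) is formulated and
solved by using the Goh transformation to derive the limit
of (18) when $CF_2 \neq 0$ which is

$$-\dot{S} = S\bar{A} + \bar{A}^T S + S\left[ B_1 (F_2^T C^T V^{-1} C F_2)^{-1} B_1^T - F_1 Q_1 F_1^T + B_w Q_w B_w^T \right] S - C^T H^T V^{-1} H C, \tag{30}$$

where $\bar{A} = A - B_1 (F_2^T C^T V^{-1} C F_2)^{-1} F_2^T C^T V^{-1} C$, $B_1 =
AF_2 - \dot{F}_2$ and $H = I - CF_2 (F_2^T C^T V^{-1} C F_2)^{-1} F_2^T C^T V^{-1}$.
Theorem 11 shows that the limiting Riccati matrix
determined from the perturbation method is the same as that
determined from the Goh transformation.

Theorem 11.

$$[u_1 \ u_2] \begin{bmatrix} 0 & 0 \\ 0 & \Pi_{022} \end{bmatrix} \begin{bmatrix} u_1^T \\ u_2^T \end{bmatrix} = S.$$

Proof. See Appendix D. □

When $CF_2 = 0$ and $C(AF_2 - \dot{F}_2) \neq 0$, from Lemma 10,

$$\Pi \to [u_1 \ \ u_2 v_1 \ \ u_2 v_2] \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \Pi_{02222} \end{bmatrix} \begin{bmatrix} u_1^T \\ v_1^T u_2^T \\ v_2^T u_2^T \end{bmatrix} \tag{31}$$

in the limit. By using $\mathrm{Im}\, v_1 = \mathrm{Im}\, A_{21}$ and $\mathrm{Im}\, u_1 = \mathrm{Im}\, F_2$,

$$\mathrm{Im}[u_1 \ \ u_2 v_1] = \mathrm{Im}[u_1 \ \ u_2 u_2^T (A u_1 - \dot{u}_1)]$$
$$= \mathrm{Im}[u_1 \ \ (I - u_1 u_1^T)(A u_1 - \dot{u}_1)]$$
$$= \mathrm{Im}[u_1 \ \ A u_1 - \dot{u}_1] = \mathrm{Im}[F_2 \ \ AF_2 - \dot{F}_2].$$

Therefore, $\ker \bar{\Pi} \supseteq \mathrm{Im}[F_2 \ \ AF_2 - \dot{F}_2] = \mathcal{W}_2$ which is
consistent with Theorem 5.

5.2. Analysis

In this section, an expression for the inverse of $\Pi$ is
derived. This shows explicitly the characteristics of $P$ near and
in the limit. Only time-invariant systems are considered
because $\Pi_{022}$ in (29) and $\Pi_{02222}$ in (31) may not be invertible
for time-varying systems. In Lemma 12, $P$ is determined
when $CF_2 \neq 0$. In Lemma 13, $P$ is determined when $CF_2 = 0$
and $CAF_2 \neq 0$. The higher-order cases, such as $CF_2 =
CAF_2 = 0$ and $CA^2 F_2 \neq 0$, can be considered similarly.

Lemma 12. When $CF_2 \neq 0$,

$$P = [u_1 \ u_2] \left( \gamma^{-1/2} \begin{bmatrix} \Pi_{211}^{-1} & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} \Pi_{211}^{-1}(\Pi_{212} \Pi_{022}^{-1} \Pi_{212}^T - \Pi_{411})\Pi_{211}^{-1} & -\Pi_{211}^{-1} \Pi_{212} \Pi_{022}^{-1} \\ -\Pi_{022}^{-1} \Pi_{212}^T \Pi_{211}^{-1} & \Pi_{022}^{-1} \end{bmatrix} + \cdots \right) \begin{bmatrix} u_1^T \\ u_2^T \end{bmatrix}. \tag{32}$$

Proof. By using Lemma 9 and matrix inversion lemma,
(32) is obtained.

Lemma 13. When $CF_2 = 0$ and $CAF_2 \neq 0$,

$$P = [u_1 \ \ u_2 v_1 \ \ u_2 v_2] \begin{bmatrix} \gamma^{-3/4} P_{11} + \gamma^{-1/2} \cdots & \gamma^{-1/2} P_{12} + \gamma^{-1/4} \cdots & P_{13} + \gamma^{1/4} \cdots \\ \gamma^{-1/2} P_{12}^T + \gamma^{-1/4} \cdots & \gamma^{-1/4} P_{22} + \cdots & P_{23} + \gamma^{1/4} \cdots \\ P_{13}^T + \gamma^{1/4} \cdots & P_{23}^T + \gamma^{1/4} \cdots & P_{33} + \gamma^{1/4} \cdots \end{bmatrix} \begin{bmatrix} u_1^T \\ v_1^T u_2^T \\ v_2^T u_2^T \end{bmatrix}, \tag{33}$$

where $P_{ij}$, $i, j = 1, \ldots, 3$, can be found in Chen (2000). Only
the lowest-order term of each element is kept for simplicity.

Proof. By using Lemma 10 and matrix inversion lemma,
(33) is obtained (Chen, 2000).

In the limit, when $CF_2 \neq 0$, Lemma 12 shows that $P$ goes
to infinity along the direction of $\mathrm{Im}\, F_2$. In the limit, when
$CF_2 = 0$ and $CAF_2 \neq 0$, Lemma 13 shows that $P$ goes to
infinity along the direction of $\mathrm{Im}[F_2 \ AF_2]$.

Remark 14. By using the result in Kwakernaak and Sivan
(1972), for the time-invariant and infinite-time case, under
the assumption that $(C,A,F_2)$ does not have right-half-plane
invariant zeros,

$$\gamma P \to 0, \tag{34a}$$

$$L \to \gamma^{-1/2} F_2 Q_2^{1/2} U^T V^{-1/2} \tag{34b}$$

as $\gamma \to 0$ where $U$ is an arbitrary matrix such that
$U^T U = I$.

To compare this result with Lemmas 12 and 13, (34a) is
satisfied by multiplying (32) and (33) by $\gamma$. By substituting
(32) into (13),

$$L \to \gamma^{-1/2} u_1 \Pi_{211}^{-1} u_1^T C^T V^{-1}$$

as $\gamma \to 0$. Therefore, $L$ goes to infinity along the direction of
$\gamma^{-1/2}\, \mathrm{Im}\, F_2$ which is consistent with (34b). By substituting
(33) into (13),

$$L \to \gamma^{-1/2} u_1 P_{12} v_1^T u_2^T C^T V^{-1} + \gamma^{-1/4} u_2 v_1 P_{22} v_1^T u_2^T C^T V^{-1}$$

as $\gamma \to 0$. Therefore, $L$ goes to infinity essentially along
the direction of $\gamma^{-1/2}\, \mathrm{Im}\, F_2$ which is consistent with (34b).
However, $L$ also goes to infinity along the direction of
$\gamma^{-1/4}\, \mathrm{Im}\, u_2 v_1$ where $\mathrm{Im}[F_2 \ \ u_2 v_1] = \mathrm{Im}[F_2 \ AF_2]$. Therefore,
the perturbation method is consistent with and generalizes
the result of Kwakernaak and Sivan (1972).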
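The asymptotic statements (34a), (34b) and the leading $\gamma^{-1/2}$ term of Lemma 12 can be checked on a scalar analogue, where the steady state of (12) has a closed form. This is a sketch under simplifying assumptions made here, not the paper's example: one state, $F_2 = 1$, $Q_1 = Q_w = 0$, $c > 0$, and a hypothetical function name. In that case the steady state is the positive root of $2aP - (c^2/V)P^2 + Q_2/\gamma = 0$, and the predicted gain from (34b) is $\gamma^{-1/2}\sqrt{Q_2/V}$.

```python
import math

def scalar_steady_state_p(a, c, v, q2, gamma):
    """Positive root of 2 a P - (c^2 / v) P^2 + q2 / gamma = 0.

    Scalar steady state of (12) with F2 = 1 and Q1 = Qw = 0 (assumptions
    of this sketch).
    """
    k = c * c / v
    return (a + math.sqrt(a * a + k * q2 / gamma)) / k

a, c, v, q2 = -1.0, 1.0, 1.0, 1.0
results = []
for gamma in (1e-4, 1e-6, 1e-8):
    p = scalar_steady_state_p(a, c, v, q2, gamma)
    gain = p * c / v                                   # filter gain (13)
    predicted = math.sqrt(q2 / v) / math.sqrt(gamma)   # (34b) with F2 = U = 1
    # Store gamma * P, which should shrink as in (34a), and the gain ratio,
    # which should approach 1.
    results.append((gamma, gamma * p, gain / predicted))
```

As $\gamma$ decreases, the second entry of each tuple shrinks toward zero and the third approaches one, in line with (34a) and (34b). The example also shows why the Riccati equation becomes ill-conditioned near the limit: $P$ grows like $\gamma^{-1/2}$.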


6. Example

In  this  section,  two  numerical  examples  are  used  to 
demonstrate  the  performance  of  the  optimal  stochastic  fault 
detection  filter.  In  Section  6.1,  the  filter  is  applied  to  a 
time-invariant  system.  In  Section  6.2,  the  filter  is  applied  to 
a  time-varying  system. 


6.1.  Example  1 


Consider the time-invariant system from White and Speyer (1987),

$$A = \begin{bmatrix} 0 & 3 & 4 \\ 1 & 2 & 3 \\ 0 & 2 & 5 \end{bmatrix}, \quad C = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad F_1 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad F_2 = \begin{bmatrix} 5 \\ 1 \\ 1 \end{bmatrix},$$

where $F_1$ is the target fault direction and $F_2$ is the nuisance fault direction. There is no process noise. To determine the optimal stochastic fault detection filter, the power spectral densities are chosen as $Q_1 = 1$, $Q_2 = 1$ and $V = I$. The steady-state solutions to the Riccati equation (12) are obtained for $\gamma = 10^{-4}$ and $\gamma = 10^{-6}$. Fig. 1 shows the frequency response from both faults to the residual (5). The left plot is obtained with $\gamma = 10^{-4}$ and the right plot with $\gamma = 10^{-6}$. In each plot, there are two solid lines representing the frequency response from the target fault to the residuals using projectors (15) and $H$, respectively. Note that these two solid lines overlap. The dash-dot line and the dashed line represent the frequency response from the nuisance fault to the residuals using projectors (15) and $H$, respectively. This example shows that the nuisance fault transmission can be reduced by using a smaller $\gamma$ while the target fault transmission remains large. Furthermore, the projector (15), derived from solving the minimization problem, is slightly better at low frequency than $H$, the projector used by other approximate unknown input observers (Chung & Speyer, 1998; Chen & Speyer, 2000). This suggests that $H$ might not be the best choice for the approximate unknown input observer.

Fig. 1. Frequency response from both faults to the residual (left panel: $\gamma = 10^{-4}$; right panel: $\gamma = 10^{-6}$).
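As a quick sanity check on the example data, the following sketch (Python with NumPy, not part of the original paper) forms $CF_1$ and $CF_2$ for the matrices of Example 1. It only confirms that $CF_2 \neq 0$, which is the case treated by Lemma 12, and that the two fault directions stay independent at the output. It does not solve the Riccati equation (12) or reproduce Fig. 1.

```python
import numpy as np

# System matrices of Example 1 (White and Speyer, 1987), as given in the text.
A = np.array([[0.0, 3.0, 4.0],
              [1.0, 2.0, 3.0],
              [0.0, 2.0, 5.0]])
C = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
F1 = np.array([[0.0], [0.0], [1.0]])   # target fault direction
F2 = np.array([[5.0], [1.0], [1.0]])   # nuisance fault direction

CF1 = C @ F1
CF2 = C @ F2

# Rank of [C F1, C F2]: 2 means the two faults are distinguishable at the output.
output_rank = np.linalg.matrix_rank(np.hstack([CF1, CF2]))

print("C F1 =", CF1.ravel())
print("C F2 =", CF2.ravel())
print("rank [C F1, C F2] =", output_rank)
```

Because $CF_2$ is nonzero here, the nuisance fault appears directly in the measurement.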


6.2. Example 2

Consider a time-varying system obtained by adding some time-varying elements to the time-invariant system in the previous section,

$$A = \begin{bmatrix} -\cos(t) & 3 + 2\sin(t) & 4 \\ 1 & 2 & 3 - 2\cos(t) \\ 5\sin(t) & 2 & 5 + 3\cos(t) \end{bmatrix}, \quad F_2 = \begin{bmatrix} 5 - 2\cos(t) \\ 1 \\ 1 + \sin(t) \end{bmatrix},$$

while $C$ and $F_1$ remain the same. The Riccati equation (12) is solved with $Q_1 = 1$, $Q_2 = 1$, $V = I$, $P_0 = I$ and $\gamma = 10^{-4}$ for $t \in [0, 25]$. Fig. 2 shows the time response of the norm of residual (5) using projector (15) when there is no fault, a target fault and a nuisance fault, respectively. The faults are steps of magnitude 3 that occur at the fifth second. The sensor noise is a zero-mean, white Gaussian noise with power spectral density of $10^{-4} I$. This example shows that the residual is very sensitive to the target fault and much less sensitive to the nuisance fault. Therefore, the filter performs well for time-varying systems.


7. Conclusion

The optimal stochastic fault detection filter is derived from solving a stochastic minimization problem. In the limit, the filter recovers the geometric structure of the unknown input observer and the nuisance fault is completely blocked. When it is not at the limit, the filter is an approximate unknown input observer and the nuisance fault is partially blocked. The perturbation method used to obtain the limiting and asymptotic behaviors of the filter can be applied to other approximate unknown input observers (Chung & Speyer, 1998; Chen & Speyer, 2000) derived by solving differential games which consider the worst-case scenarios. For time-invariant systems, the filter performance can be enhanced by replacing the nuisance fault directions with the invariant zero directions. This notion can also be applied to other approximate unknown input observers. Finally, filter designs can be obtained for both time-invariant and time-varying systems.
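Appendix A. Proof of Lemma 8

By substituting (23) into (18) and collecting terms of common power,

$$\gamma^{-1}: \quad 0 = \Pi_0 \bar{Q}_2 \Pi_0, \tag{A.1a}$$

$$\gamma^{-3/4}: \quad 0 = \Pi_1 \bar{Q}_2 \Pi_0 + \Pi_0 \bar{Q}_2 \Pi_1, \tag{A.1b}$$

$$\gamma^{-1/2}: \quad 0 = \Pi_2 \bar{Q}_2 \Pi_0 + \Pi_1 \bar{Q}_2 \Pi_1 + \Pi_0 \bar{Q}_2 \Pi_2, \tag{A.1c}$$

$$\gamma^{-1/4}: \quad 0 = \Pi_3 \bar{Q}_2 \Pi_0 + \Pi_2 \bar{Q}_2 \Pi_1 + \Pi_1 \bar{Q}_2 \Pi_2 + \Pi_0 \bar{Q}_2 \Pi_3, \tag{A.1d}$$

$$\gamma^{0}: \quad -\dot{\Pi}_0 = \Pi_0 A + A^T \Pi_0 - C^T V^{-1} C + \Pi_4 \bar{Q}_2 \Pi_0 + \Pi_3 \bar{Q}_2 \Pi_1 + \Pi_2 \bar{Q}_2 \Pi_2 + \Pi_1 \bar{Q}_2 \Pi_3 + \Pi_0 \bar{Q}_2 \Pi_4 - \Pi_0 \bar{Q}_1 \Pi_0, \tag{A.1e}$$

$$\gamma^{1/4}: \quad -\dot{\Pi}_1 = \Pi_1 A + A^T \Pi_1 + \Pi_5 \bar{Q}_2 \Pi_0 + \Pi_4 \bar{Q}_2 \Pi_1 + \Pi_3 \bar{Q}_2 \Pi_2 + \Pi_2 \bar{Q}_2 \Pi_3 + \Pi_1 \bar{Q}_2 \Pi_4 + \Pi_0 \bar{Q}_2 \Pi_5 - \Pi_1 \bar{Q}_1 \Pi_0 - \Pi_0 \bar{Q}_1 \Pi_1, \tag{A.1f}$$

where $\bar{Q}_2 = F_2 Q_2 F_2^T$ and $\bar{Q}_1 = F_1 Q_1 F_1^T - B_w Q_w B_w^T$.

From (A.1a), $\Pi_0$ can be written as

$$\Pi_0 = [u_1 \;\; u_2] \begin{bmatrix} 0 & 0 \\ 0 & \Pi_{022} \end{bmatrix} \begin{bmatrix} u_1^T \\ u_2^T \end{bmatrix} = u_2 \Pi_{022} u_2^T, \tag{A.2}$$

where $\Pi_{022}$ is to be determined. (A.1b) is trivially satisfied because of (A.2). By substituting (A.2) into (A.1c), $\Pi_1$ can be written as

$$\Pi_1 = [u_1 \;\; u_2] \begin{bmatrix} 0 & 0 \\ 0 & \Pi_{122} \end{bmatrix} \begin{bmatrix} u_1^T \\ u_2^T \end{bmatrix} = u_2 \Pi_{122} u_2^T, \tag{A.3}$$

where $\Pi_{122}$ is to be determined. (A.1d) is trivially satisfied because of (A.2) and (A.3).

Let

$$\Pi_2 = [u_1 \;\; u_2] \begin{bmatrix} \Pi_{211} & \Pi_{212} \\ \Pi_{212}^T & \Pi_{222} \end{bmatrix} \begin{bmatrix} u_1^T \\ u_2^T \end{bmatrix}. \tag{A.4}$$

By multiplying (A.1e) by $[u_1 \;\; u_2]^T$ from the left and $[u_1 \;\; u_2]$ from the right, and substituting (A.2), (A.3) and (A.4), (24) is obtained. Let

$$\Pi_3 = [u_1 \;\; u_2] \begin{bmatrix} \Pi_{311} & \Pi_{312} \\ \Pi_{312}^T & \Pi_{322} \end{bmatrix} \begin{bmatrix} u_1^T \\ u_2^T \end{bmatrix}. \tag{A.5}$$

By multiplying (A.1f) by $[u_1 \;\; u_2]^T$ from the left and $[u_1 \;\; u_2]$ from the right, and substituting (A.2), (A.3), (A.4) and (A.5), (25) is obtained. The same procedure can be used to obtain the equations for the higher-order terms if needed (Chen, 2000; Chen, Mingori, & Speyer, 2001).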

Remark 15. Since $\mathrm{Im}\,u_1 = \mathrm{Im}\,F_2$, $u_1$ can be chosen as $F_2 (F_2^T F_2)^{-1/2}$. Since $[u_1 \;\; u_2]$ is unitary, $u_2$ has to satisfy $u_1^T u_2 = 0$ and $u_2^T u_2 = I$. Define $U_1 = I - u_1 u_1^T$. Since $u_1^T U_1 = 0$, the first column of $u_2$, called $u_{21}$, can be chosen as $u_{21} = U_{1i} (U_{1i}^T U_{1i})^{-1/2}$ where $U_{1i}$ is any nonzero column of $U_1$. Next, define $U_2 = I - [u_1 \;\; u_{21}][u_1 \;\; u_{21}]^T$. Then, the second column of $u_2$, called $u_{22}$, can be chosen as $u_{22} = U_{2i} (U_{2i}^T U_{2i})^{-1/2}$ where $U_{2i}$ is any nonzero column of $U_2$. Other directions of $u_2$ can be obtained similarly. $\dot{u}_1$ and $\dot{u}_2$ can also be obtained since $u_1$ and $u_2$ are explicitly written as functions of time. For time-invariant systems, $[u_1 \;\; u_2]$ can also be obtained from the singular value decomposition of $F_2 Q_2 F_2^T$, and $[\dot{u}_1 \;\; \dot{u}_2] = 0$. Note that $[u_1 \;\; u_2]$ is generally not unique. However, the theorem and all lemmas in Section 5 are true for any $[u_1 \;\; u_2]$ such that $\mathrm{Im}\,u_1 = \mathrm{Im}\,F_2$ and $[u_1 \;\; u_2]$ is unitary.
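The construction in Remark 15 can be sketched numerically as follows. This is a minimal illustration in Python with NumPy, not code from the paper. The function name `unitary_basis` is hypothetical, and always picking the largest-norm column of the projector (the remark allows any nonzero column) is an assumption made here for numerical conditioning.

```python
import numpy as np

def unitary_basis(F2):
    """Build u1, u2 with Im u1 = Im F2 and [u1 u2] unitary (Remark 15 sketch)."""
    n = F2.shape[0]
    # u1 = F2 (F2^T F2)^{-1/2}, using the symmetric inverse square root.
    w, V = np.linalg.eigh(F2.T @ F2)
    u1 = F2 @ (V @ np.diag(w ** -0.5) @ V.T)

    basis = u1
    new_columns = []
    while basis.shape[1] < n:
        # Projector onto the orthogonal complement of the columns found so far.
        U = np.eye(n) - basis @ basis.T
        norms = np.linalg.norm(U, axis=0)
        i = int(np.argmax(norms))          # assumed choice: best-conditioned column
        col = U[:, [i]] / norms[i]         # U_i (U_i^T U_i)^{-1/2}
        new_columns.append(col)
        basis = np.hstack([basis, col])

    u2 = np.hstack(new_columns) if new_columns else np.zeros((n, 0))
    return u1, u2

# Nuisance fault direction of Example 1.
F2 = np.array([[5.0], [1.0], [1.0]])
u1, u2 = unitary_basis(F2)
T = np.hstack([u1, u2])
print(np.round(T.T @ T, 12))   # should be close to the identity
```

For time-varying $F_2(t)$ the same routine would be called at each time instant, although the column choice would then need to be made consistently in time for $\dot{u}_1$ and $\dot{u}_2$ to exist.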


Appendix B. Proof of Lemma 9

When $CF_2 \neq 0$, $R_{11}$ is positive definite because $\mathrm{Im}\,u_1 = \mathrm{Im}\,F_2$. Then, from (24a), (27b) is obtained. Note that $\Pi_{211}$ is positive definite. From (24b), (27c) is obtained. By substituting (27c) into (24c) and using (24a), (27a) is obtained. Therefore, the zeroth-order term $\Pi_0$ (A.2) can be obtained from (27a). Part of the second-order term $\Pi_2$ (A.4) can be obtained from (27b) and (27c).

From  (25a),  U311  —  0  because  <j  and  U2u  are  positive 
definite.  By  substituting  7/311=0  into  (25b), 

77312  =  ^n^Aln^  (B.l) 

By  substituting  (B.l)  into  (25c), 

77i22  =77122(^22  4-  022 77o22  4- ^21/7^1/7212) 

+  (“^22  +  022T7o22  +  ^21  ^211 77212  )T77l22«  (®*2) 

Since (B.2) is a homogeneous equation and the initial condition is zero, $\Pi_{122} = 0$. By substituting $\Pi_{122} = 0$ into (B.1), $\Pi_{312} = 0$. Therefore, the first-order term $\Pi_1$ (A.3) and part of the third-order term $\Pi_3$ (A.5) are zero. A similar procedure can be used to obtain (27d) (Chen, 2000; Chen et al., 2001). Therefore, the second-order term $\Pi_2$ (A.4) can be obtained from (27b), (27c) and (27d). A similar procedure can be used to obtain the equations for the higher-order terms if needed (Chen, 2000; Chen et al., 2001). It can be shown that the rest of the odd terms (i.e., $\Pi_3, \Pi_5, \ldots$) are zero. Therefore, when $CF_2 \neq 0$, the expansion of $\Pi$ (23) only needs to be in the order of $\gamma^{1/2}$.


Remark 16. Since $\Pi_{211}$ and $\Pi_{212}$ are obtained from algebraic equations (27b) and (27c), the initial condition $\Pi(t_0) = P_0^{-1}$ cannot be satisfied in general. This is because the dimension of the Riccati equation (18) is reduced in the limit as $\gamma \to 0$, which leads to the occurrence of a boundary layer (Nayfeh, 1973). The expansion of $\Pi$ (26) is called the outer expansion and is valid everywhere except near $t = 0$. The inner expansion, which is valid only near $t = 0$, can be obtained by using different fast time scales (Nayfeh, 1973). Since the inner expansion is only valid for a very short period of time, only the boundary layer is obtained and used as the initial condition of the outer expansion (Chen et al., 2001). Note that in the limit, the fast time scale goes to infinity and there is an instant jump at the initial time, which is consistent with the Goh transformation (Chen & Speyer, 2000; Chung & Speyer, 1998).


Appendix  C.  Proof  of  Lemma  10 

When  CF2 = 0,  R\ %  —  0  and  R\2  =  0  because  Im  ux  — ImF2. 
From  (24a),  il2n  =  0  because  a  is  positive  definite.  By 
substituting  I72n  =  0  into  (24b),  n022A2X  -  0.  Then,  U022 
can  be  written  as 


G2211  Q2212 


F2211  P2212 


—  j  Q22[V\  V2\ 


^22[ni  v2]. 


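Since $\mathrm{Im}\,u_1 = \mathrm{Im}\,F_2$, $Cu_1 = 0$ and $C(Au_1 - \dot{u}_1) \neq 0$. Since $R_{2211} = v_1^T u_2^T C^T V^{-1} C u_2 v_1$ and $\mathrm{Im}\,v_1 = \mathrm{Im}\,A_{21}$, $R_{2211}$ is positive definite because

$$\begin{aligned}
A_{21}^T u_2^T C^T V^{-1} C u_2 A_{21}
&= (u_1^T A^T u_2 - \dot{u}_1^T u_2)\, u_2^T C^T V^{-1} C u_2\, (u_2^T A u_1 - u_2^T \dot{u}_1) \\
&= (u_1^T A^T - \dot{u}_1^T)(I - u_1 u_1^T)\, C^T V^{-1} C\, (I - u_1 u_1^T)(A u_1 - \dot{u}_1) \\
&= (A u_1 - \dot{u}_1)^T C^T V^{-1} C (A u_1 - \dot{u}_1) > 0.
\end{aligned}$$

Then, $\Pi_{2121}$ is invertible. From (C.3b),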

112122  =  <?  ^2121(^2212  “  ^2221^02222)*  (C.4) 

By  substituting  (C.4)  into  (C.3e)  and  using  (C.3a), 

-  02222  =  H02222(A2222  ~  A222lR22n^22l2) 

+  (+2222  —  *^2221  ^221 1*^2212  )Tf? 02222 


(C.l) 


(C.2) 


+  ^02222(^2221^2211^2221  “  G2222W02222 

(R2222  ~  ^2212^2211^2212).  (C.5) 

Therefore, the zeroth-order term $\Pi_0$ (A.2) can be obtained from (C.1) and (C.5). A similar procedure can be used to obtain the equations for other terms (Chen, 2000; Chen et al., 2001). Therefore, $\Pi$ can be expressed as


17  =  [Ul  u2 ] 


yV4nm  +  ?••■ 

y'/2njl2  +  yi/4  •  •  ■  [Vl  v2] 


yl,2n212  +  y3'A--. 

yl/4nmn  +  yl/2  ■  ■  ■  yl/4nl2212  +  y ^  ■ 
yi/l4nj22\i  +  y1*2  ■  •  •  n  02222  +  y^4  •  •  • 


By multiplying (24c) by $[v_1 \;\; v_2]^T$ from the left and $[v_1 \;\; v_2]$ from the right, and substituting (C.1) and (C.2),


0-^2121^2121  —R22ll 


0  —  #2121^2122  +  +2221^02222  “  R2 212* 


(C.3b) 


“  ^02222  —  F o2222 A 2222  +  +2222^02222  “  f?02222 02222^02222 
—  R2222  +  H2X22&Il2l22t  (C.3c) 


+2221  +2222 


A22[vx  v2]  -  T  [VX  V2% 


which  can  be  written  as  (28). 


(C.3a)

Appendix D. Proof of Theorem 11


Since $SF_2 = 0$ (Chen & Speyer, 2000), $S$ can be written as

r°  ki 

s  =  iux  u2 1  I  ;  L 


L°  sJL«iJ 

By  multiplying  (30)  by  [ui  u2f  from  the  left  and 
[«i  u2]  from  the  right,  subtracting  [«,  M2]TS[tii  it2] 
and  [m  i  W2]tS[mi  M2]  from  both  sides,  and  using  [u\  u2 ] 
[«i  m2]t  =  I, 


0  0 


0  -S 


■  S\  +  Sj  +  S2  —  So, 
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where 

Si  = 


Ml 

Mi  Mi 


r„T 


Bi(F]Cr  V~ 1  CF2)~ lFjCJV~l  C[«i  u2]  ) ,  (D.2a) 


Si  = 


S0  = 


0  0 
0  f 

Qu  Qn 
Q12  Qn  J 

An  An 

Rj2  R-22 


B,(FjCTV~lCF2rlBj[ul  u2) 


0  0 
0  f 

Ru  Ri2 


(D,2b) 


$12  Ml 


T 
12 

f2(f2‘c1  v~'cf2)~1fJ 

?ii  ^12 


1 

1 

£* 

tWM-j 

C«i  u2] 


LRl  Rn 


(D.2c) 


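Since $\mathrm{Im}\,u_1 = \mathrm{Im}\,F_2$, let $u_1 = F_2 \Sigma$ where $\Sigma$ satisfies $\Sigma^T F_2^T F_2 \Sigma = I$ because $u_1^T u_1 = I$. By using $u_1 = F_2 \Sigma$ and $\Sigma^{-1} = \Sigma^T F_2^T F_2$,

$$u_1^T F_2 (F_2^T C^T V^{-1} C F_2)^{-1} F_2^T u_1 = [\Sigma^T (F_2^T C^T V^{-1} C F_2) \Sigma]^{-1} = R_{11}^{-1}. \tag{D.3}$$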

By  using  mJ^2  =  0  and  (D.3), 

'0  0 


50  = 


[0  A22-A|2AnlA12J 


(D.4) 


S2  = 


Since  u  i  =  P2Z  +  F2Z,  «|«  i  =  ulF2Z  because  ujF2  =  0. 
By  using  Bt  =  AF2  -  F2,  ux  =  F2F  and  ujF2Z  =  ujux, 

i4B1Z=A21.  (D.5) 

By  using  (D.3),  «i  =  F2Z,  Z~l  =  ZTFjF2  and  (D.5), 

kJf1(fJctf-1cf2)-1sJh2 =^2,A-,14i-  (D.6) 

By  using  (D.6), 

'0  0 

[0  SiAuR^Ajt  -  Q22)S 
By  using  u\  =  F2r  and  (D.5), 
uJBi(F2CtV~  1 CF2  )-  ‘fJct  F- 1  Cttj  =  A2l . 

By  using  (D.3),  «i  =  F2Z,  Z~l  =  ZTFjF2  and  (D.5), 
«IS1(FjCTF-1CF2)-IFjCTF-,Cu2  =A2lR^RX2.  (D.9) 
By  using  (D.8)  and  (D.9), 

’0  0 

[0  S{A22-A2xRulRn)_ 


(D.7) 


(D.8) 


S\  = 


(D.10) 


By  substituting  (D.IO),  (D.7)  and  (D.4)  into  (D.l), 
—  if  =  S(A22  —  A2\RulRi2)  +  (A22  —  A2\R^  RX2)T  S 


+  SiAuRyi  aJx  —  Q22)S 


-(R22-Rl2RuRn)- 


(D.ll) 


By  comparing  (D.l  1 )  with  (27a),  f  =  77022. 
Abstract

Many  fault  detection  filters  have  been  developed  to  detect  and  identify  sensor  and  actuator  faults. 
An  approach  to  further  reconstruct  sensor  and  actuator  faults  from  the  residual  generated  by  the  fault 
detection  filter  is  proposed.  The  transfer  matrix  from  the  faults  to  the  residual  is  derived  in  terms  of 
the  eigenvalues  of  the  fault  detection  filter  associated  with  the  invariant  subspaces  of  the  faults  and  the 
invariant  zeros  of  the  faults.  For  each  fault,  all  possible  fault  reconstruction  processes  are  derived  and 
parameterized  by  applying  a  projector  to  the  transfer  matrix  and  taking  inverse.  Then,  the  optimal  fault 
reconstruction  process  is  determined  by  minimizing  the  ratio  of  the  H2  norm  of  the  projected  transfer 
matrix  from  the  disturbance  to  the  H2  norm  of  the  projected  transfer  matrix  from  the  fault.  For  the 
existence  of  the  fault  reconstruction  process,  the  invariant  zeros  of  the  fault  have  to  be  in  the  left-half 
plane.  Furthermore,  for  reconstructing  a  sensor  fault,  the  system  has  to  be  detectable  with  respect  to 
the  other  sensors. 


1  Introduction 

Any  system  under  automatic  control  demands  a  high  degree  of  reliability  in  order  to  operate  properly.  If 
a  sensor  fails,  the  controller’s  command  will  be  generated  using  incorrect  measurement.  If  an  actuator 
fails,  the  controller’s  command  will  not  be  applied  properly  to  the  system.  Therefore,  one  needs  a  health 
monitoring  system  capable  of  detecting  a  fault  as  it  occurs  and  identifying  the  faulty  component.  This 
process is called fault detection and identification. One approach to fault detection and identification is the fault detection filter, which was first introduced by [1] and refined by [2]. It is also known as the Beard-Jones detection filter. A geometric interpretation and a spectral analysis of the fault detection filter are given in
[3]  and  [4],  respectively.  The  idea  of  the  fault  detection  filter  is  to  place  the  reachable  subspace  of  each  fault 
into  invariant  subspaces  which  do  not  overlap  each  other.  Then,  when  a  nonzero  residual  is  detected,  a  fault 
can  be  announced  and  identified  by  projecting  the  residual  onto  each  of  the  invariant  subspaces.  Design 
algorithms  have  been  developed  to  improve  the  robustness  of  the  fault  detection  filter  [5,  6]. 
Figure 1: Fault detection, identification and reconstruction (block diagram; the fault inputs are labeled "Actuator faults" and "Sensor faults")

When these faults occur, the residual only has a transient response and becomes zero after a while even though the faults still exist. A bias in a single position sensor is one possible example. However, for some of these faults, the fault reconstruction process can still generate the magnitudes of the faults even after the residual becomes zero.

In  Section  2,  the  background  of  the  fault  detection  filter  is  given.  In  Section  3,  the  transfer  matrix  from 
the  fault  to  the  residual  is  derived.  In  Section  4,  the  reconstruction  of  sensor  and  actuator  faults  is  discussed. 
In  Section  5,  a  numerical  example  is  given. 

2  Fault  Detection  Filter  Background 

In this section, the background of the fault detection filter is given [1, 3, 4, 9]. This is important because
the  fault  reconstruction  process  uses  the  residual  generated  by  the  fault  detection  filter  to  generate  the 
magnitudes  of  actuator  and  sensor  faults. 

Consider  a  linear  time-invariant  system, 


$$\dot{x} = Ax + Bu \tag{1a}$$

$$y = Cx \tag{1b}$$

where $u$ is the control input and $y$ is the measurement. The $i$th actuator fault can be modeled as an additive term in the state equation (1a) [1, 4],

$$\dot{x} = Ax + Bu + F_{ai}\mu_{ai}$$

where $F_{ai}$ is the $i$th column of $B$ and $\mu_{ai}$ is an unknown and arbitrary scalar function of time that is zero when there is no fault. The failure mode $\mu_{ai}$ models the time-varying amplitude of the actuator fault while the failure signature $F_{ai}$ models the directional characteristics of the actuator fault. The $i$th sensor fault can be modeled as an additive term in the measurement equation (1b) [1, 4],

$$y = Cx + E_{si}\mu_{si} \tag{2}$$
The fault detection filter gain $L$ is chosen such that $A - LC$ is stable and there exists an invariant subspace $\mathcal{T}_i$ for each fault $F_i$. $\mathcal{T}_i$ is called the minimal $(C, A)$-unobservability subspace or the detection space of $F_i$. Assume that the invariant zeros of $(C, A, F_i)$ have the same geometric and algebraic multiplicities. Then, $\mathcal{T}_i$ can be obtained by


where $k_i$ is the smallest non-negative integer such that $CA^{k_i}F_i \neq 0$. It is assumed that $\mathcal{T}_1 \cdots \mathcal{T}_q$ are independent. If they are not independent, the faults can only be detected, but not identified. This condition is called output separability. It is also assumed that $(C, A, [F_1 \cdots F_q])$ does not have more invariant zeros than $(C, A, F_1) \cdots (C, A, F_q)$. If it does, the extra invariant zeros will become part of the eigenvalues of $A - LC$. This condition is called mutual detectability. For more details, please refer to [3]. For the algorithms to form the fault detection filter gain $L$, please refer to [4, 5, 6].

When there is no fault, the residual is zero after the transient response due to the initial condition error because $A - LC$ is stable. When the fault $\mu_i$ occurs, the residual becomes nonzero, but only in the direction of $C\mathcal{T}_i$ because $\mathrm{Im}\,F_i \subset \mathcal{T}_i$ and $(A - LC)\mathcal{T}_i \subset \mathcal{T}_i$. By using a projector

$$H_i = I - \mathrm{Ker}\,H_i \left[ (\mathrm{Ker}\,H_i)^T \mathrm{Ker}\,H_i \right]^{-1} (\mathrm{Ker}\,H_i)^T \tag{10}$$

where $\mathrm{Ker}\,H_i = [\,C\mathcal{T}_1 \cdots C\mathcal{T}_{i-1} \;\; C\mathcal{T}_{i+1} \cdots C\mathcal{T}_q\,]$, the projected residual $H_i r$ is only sensitive to the fault $\mu_i$, but not to the other faults. Therefore, the fault detection filter can detect and identify actuator and sensor faults.
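The projector (10) can be illustrated with a short numerical sketch in Python with NumPy. This is not code from the paper. The function name `fault_projector` and the three output directions below are hypothetical, and the sketch assumes the stacked columns of the other $C\mathcal{T}_j$ are linearly independent, so that the inverse in (10) exists.

```python
import numpy as np

def fault_projector(CT_list, i):
    """Projector (10): annihilates C T_j for every j != i.

    CT_list is a list of matrices whose columns span C T_j (assumed inputs).
    """
    K = np.hstack([CT for j, CT in enumerate(CT_list) if j != i])
    m = K.shape[0]
    # H = I - K (K^T K)^{-1} K^T, computed with a linear solve instead of an inverse.
    return np.eye(m) - K @ np.linalg.solve(K.T @ K, K.T)

# Hypothetical output directions for three faults in a three-output system.
CT = [np.array([[1.0], [0.0], [0.0]]),
      np.array([[1.0], [1.0], [0.0]]),
      np.array([[0.0], [1.0], [1.0]])]

H0 = fault_projector(CT, 0)
print(np.round(H0 @ CT[1], 12).ravel())  # other faults are blocked
print(np.round(H0 @ CT[2], 12).ravel())
print(np.round(H0 @ CT[0], 12).ravel())  # fault 0 still passes through
```

A residual component along $C\mathcal{T}_0$ survives the projection while components along the other two directions are removed, which is the identification property described above.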
After the faulty sensor or actuator has been detected and identified, the system may switch to an
identical  redundant  sensor  or  actuator.  If  such  sensor  or  actuator  is  not  available,  the  controller  has  to 
be  reconfigured  based  on  the  remaining  non-faulty  sensors  and  actuators.  However,  if  the  magnitude  of 
the  sensor  fault  can  be  obtained,  the  correct  measurement  can  be  obtained  by  subtracting  the  fault  from 
the  faulty  measurement.  Then,  the  controller  may  continue  to  function  normally  without  the  need  for 
reconfiguration.  This  is  particularly  useful  when  an  intermittent  sensor  fault  occurs.  If  the  magnitude  of  the 
actuator  fault  can  be  obtained,  the  control  input  applied  to  the  system  can  be  obtained  by  adding  the  fault 
to  the  control  command.  Then,  the  condition  of  the  actuator  can  be  diagnosed  and  the  controller  can  be 
reconfigured  such  that  the  faulty  actuator  may  be  compensated.  For  example,  if  a  bias  is  developed  in  the 
actuator,  it  may  be  compensated  by  reconfiguring  the  controller  given  the  size  of  the  bias.  If  the  actuator  is 
stuck  in  certain  position,  it  can  be  diagnosed  because  the  control  input  would  be  a  constant  regardless  of  the 
control  command.  Then,  the  controller  may  be  reconfigured  by  using  the  remaining  non-faulty  actuators  to 
compensate  the  faulty  actuator  allowing  continued  operation  of  the  system.  Therefore,  fault  reconstruction 
increases  the  flexibility  of  the  system’s  reaction  to  sensor  and  actuator  faults. 

In  this  paper,  an  approach  for  reconstructing  sensor  and  actuator  faults  is  presented.  The  fault  recon¬ 
struction  process  generates  the  magnitudes  of  sensor  and  actuator  faults  using  the  residual  generated  by 
the  fault  detection  filter.  The  block  diagram  is  shown  in  Figure  1.  The  transfer  matrix  from  the  faults  to 
the  residual  is  derived  in  terms  of  the  eigenvalues  of  the  fault  detection  filter  associated  with  the  invariant 
subspaces  of  the  faults  and  the  invariant  zeros  of  the  faults.  By  applying  a  projector  to  the  transfer  ma¬ 
trix,  a  projected  residual  that  is  only  sensitive  to  one  fault,  but  not  to  the  other  faults,  is  obtained.  By 
taking  the  inverse  of  the  projected  transfer  matrix,  all  possible  fault  reconstruction  processes  are  derived 
and  parameterized.  Then,  the  optimal  fault  reconstruction  process  is  determined  by  minimizing  the  ratio  of 
the  H2  norm  of  the  transfer  matrix  from  the  disturbance  to  the  projected  residual  over  the  norm  of  the 
transfer  matrix  from  the  fault  to  the  projected  residual.  For  the  existence  of  the  fault  reconstruction  process, 
the  invariant  zeros  of  the  fault  have  to  be  in  the  left-half  plane.  Furthermore,  for  reconstructing  a  sensor 
fault,  the  system  has  to  be  detectable  with  respect  to  the  other  sensors.  Note  that  the  fault  reconstruction 
process can also be derived numerically from the state-space models of the plant and fault detection filter by using Silverman's algorithm [7, 8]. However, Silverman's algorithm can produce only one particular fault reconstruction process, which is not optimal in general. Furthermore, the existence conditions and the analytical structure of the fault reconstruction process cannot be obtained.

In addition to being used for fault reconstruction, the transfer matrix from the faults to the residual provides a frequency domain interpretation of the fault detection filter which complements the geometric interpretation by [3]. The transfer matrix provides information about the transient and steady-state responses of the residual to the faults. It shows explicitly the types of faults that the fault detection filter cannot detect.
where $E_{si}$ is a column of zeros except for a one in the $i$th position and $\mu_{si}$ is an unknown and arbitrary scalar function of time that is zero when there is no fault. The failure mode $\mu_{si}$ models the time-varying amplitude of the sensor fault while the failure signature $E_{si}$ models the directional characteristics of the sensor fault. For the purpose of fault detection filter design, an input to the state equation (1a) which drives the measurement in the same way that $\mu_{si}$ does in (2) is obtained as in [9]. Define a new state $\bar{x} = x + f_{si}\mu_{si}$ where $E_{si} = Cf_{si}$. Then, the dynamic equation of $\bar{x}$ and (2) can be written as

$$\dot{\bar{x}} = A\bar{x} + Bu + [\,f_{si} \;\; \bar{f}_{si}\,] \begin{bmatrix} \dot{\mu}_{si} \\ -\mu_{si} \end{bmatrix} \tag{3a}$$

$$y = C\bar{x} \tag{3b}$$

where $\bar{f}_{si} = Af_{si}$. Hence, for fault detection filter design, the sensor fault can be modeled as a two-dimensional additive term in the state equation as in (3).

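Therefore, a linear time-invariant system with $q_a$ actuator faults and $q_s$ sensor faults can be modeled as

$$\dot{x} = Ax + Bu + \sum_{i=1}^{q_a} F_{ai}\mu_{ai} \tag{4a}$$

$$y = Cx + \sum_{i=1}^{q_s} E_{si}\mu_{si} \tag{4b}$$

However, for fault detection filter design, the following system is used.

$$\dot{x} = Ax + Bu + \sum_{i=1}^{q_a} F_{ai}\mu_{ai} + \sum_{i=1}^{q_s} [\,f_{si} \;\; \bar{f}_{si}\,] \begin{bmatrix} \dot{\mu}_{si} \\ -\mu_{si} \end{bmatrix} = Ax + Bu + \sum_{i=1}^{q} F_i\mu_i \tag{5a}$$

$$y = Cx \tag{5b}$$

where $q = q_a + q_s$. For $i = 1 \cdots q_a$, $F_i = F_{ai}$ and $\mu_i = \mu_{ai}$. For $i = 1 \cdots q_s$, $F_{i+q_a} = [\,f_{si} \;\; \bar{f}_{si}\,]$ and $\mu_{i+q_a} = [\,\dot{\mu}_{si} \;\; -\mu_{si}\,]^T$.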

Now the fault detection filter will be introduced from the geometric point of view [3]. The design algorithms [4, 5, 6] are omitted because only the geometric properties of the fault detection filter are involved in the derivation of the fault reconstruction process. Assume $(C, A)$ is observable. The fault detection filter is a linear observer in the form of

$$\dot{\hat{x}} = A\hat{x} + Bu + L(y - C\hat{x}) \tag{6a}$$

$$r = y - C\hat{x} \tag{6b}$$

where $r$ is called the residual. By using (5) and (6), the dynamic equation of the error $e = x - \hat{x}$ and the residual can be written as

$$\dot{e} = (A - LC)e + \sum_{i=1}^{q} F_i\mu_i$$

$$r = Ce$$
3 Transfer Matrix from the Fault to the Residual


In this section, the transfer matrix from the fault to the residual is derived in terms of the eigenvalues of the fault detection filter associated with the detection space of the fault and the invariant zeros of the fault. This gives a frequency domain interpretation of the fault detection filter which complements the geometric interpretation by [3]. The transfer matrix provides information about the transient and steady-state responses of the residual to the fault. It shows explicitly the types of faults that the fault detection filter cannot detect. In Section 3.1, the actuator fault is considered. In Section 3.2, the sensor fault is considered.

3.1  Actuator  Fault 

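From (4) and (6), the transfer matrix from the actuator fault $\mu_{ai}$ to the residual $r$ is

$$\frac{r(s)}{\mu_{ai}(s)} = C(sI - A + LC)^{-1}F_{ai}$$

When $(C, A, F_{ai})$ has $p_{ai}$ invariant zeros at $z_{ai,1} \cdots z_{ai,p_{ai}}$, from (7), the detection space of $F_{ai}$ is

$$\mathcal{T}_{ai} = \mathrm{Im}\,[\,F_{ai} \;\; AF_{ai} \;\cdots\; A^{k_{ai}}F_{ai} \;\; \nu_{ai,1} \;\; \nu_{ai,2} \;\cdots\; \nu_{ai,p_{ai}}\,]$$

where $k_{ai}$ is the smallest non-negative integer such that $CA^{k_{ai}}F_{ai} \neq 0$ and $\nu_{ai,1} \cdots \nu_{ai,p_{ai}}$ are the invariant zero directions. Let $\delta_{ai} = \dim \mathcal{T}_{ai} = k_{ai} + p_{ai} + 1$. Assume that $\lambda_{ai,1} \cdots \lambda_{ai,\delta_{ai}}$, the eigenvalues of $A - LC$ associated with $\mathcal{T}_{ai}$, are distinct. Since $\mathcal{T}_{ai}$ spans $\delta_{ai}$ eigenvectors of $A - LC$,

$$(A - LC)x_j = \lambda_{ai,j}x_j \tag{11}$$

where $j = 1 \cdots \delta_{ai}$ and

$$[\,F_{ai} \;\; AF_{ai} \;\cdots\; A^{k_{ai}}F_{ai} \;\; \nu_{ai,1} \;\cdots\; \nu_{ai,p_{ai}}\,] = [\,x_1 \;\; x_2 \;\cdots\; x_{\delta_{ai}}\,] \begin{bmatrix} \alpha_{1,1} & \alpha_{1,2} & \cdots & \alpha_{1,\delta_{ai}} \\ \alpha_{2,1} & \alpha_{2,2} & \cdots & \alpha_{2,\delta_{ai}} \\ \vdots & \vdots & & \vdots \\ \alpha_{\delta_{ai},1} & \alpha_{\delta_{ai},2} & \cdots & \alpha_{\delta_{ai},\delta_{ai}} \end{bmatrix} \tag{12}$$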

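If $\lambda_{ai,1} \cdots \lambda_{ai,\delta_{ai}}$ are not distinct, (11) may be modified with the generalized eigenvectors. For $1 \le k \le k_{ai}$, since $CA^{k-1}F_{ai} = 0$,

$$\begin{aligned}
(A - LC)A^kF_{ai} &= (A - LC)A^kF_{ai} - (A - LC)LCA^{k-1}F_{ai} = (A - LC)^2A^{k-1}F_{ai} = \cdots \\
&= (A - LC)^{k+1}F_{ai} = (A - LC)^{k+1}\sum_{j=1}^{\delta_{ai}} \alpha_{j,1}x_j = \sum_{j=1}^{\delta_{ai}} \lambda_{ai,j}^{k+1}\alpha_{j,1}x_j
\end{aligned}$$

and

$$(A - LC)A^kF_{ai} = (A - LC)\sum_{j=1}^{\delta_{ai}} \alpha_{j,k+1}x_j = \sum_{j=1}^{\delta_{ai}} \lambda_{ai,j}\alpha_{j,k+1}x_j$$

The resulting relationship is

$$\alpha_{j,k+1} = \lambda_{ai,j}^{k}\,\alpha_{j,1} \tag{13}$$

for $j = 1 \cdots \delta_{ai}$ and $k = 1 \cdots k_{ai}$. For $1 \le k \le p_{ai}$, by using (9),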

«a* 

—  —  ^ai,k^aitk  ~  Fai&aifi  ~  ^^(^ai?fe%\fe4-fegi+l  — 


i=i 


and 


v«  UQt 

-  LC)vai  k  =  (A  -  LC)yg^k+kai+lXj  =  ^ AaiiiaJtfe+*oi+1x,- 

i=i  i=i 

Assume that $\lambda_{ai,j} \neq z_{ai,k}$. Then the resulting relationship is


Vgi%kQj,  1 


(14) 


%ai,k  ^aij 

for $j = 1 \cdots \delta_{ai}$ and $k = 1 \cdots p_{ai}$. For the case where $\lambda_{ai,j} = z_{ai,k}$, please see the Appendix. By substituting (13) and (14) into (12),


[  F ai  AFai  ***  A,  a%Fai  t/aitl  ***  ^ai^pai  j 


anXi  a  2A&2 


a$aUlX6ai 


i 


Aai,l 

»  \&ai 

Aai,l 

&*i,  1 

Aai,I 

Za*tPai  ^a€fi 

AGi,2  * 

.  \^ai 

Aai,2 

i?oi,pni 

^ai,I 

Zaitpai  2 

A  aitSai  * 

\kai 

Am,Sai 

^ai.l 

Pai>Pai 

^a4,l  Aai,£a£ 

zaitPa,i~^ia.it6ai  . 

(15) 


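From (11),

$$(sI - A + LC)x_j = (s - \lambda_{ai,j})x_j, \qquad (sI - A + LC)^{-1}x_j = \frac{1}{s - \lambda_{ai,j}}\,x_j$$

Then, since $F_{ai} = \sum_{j=1}^{\delta_{ai}} \alpha_{j,1}x_j$ from (12),

$$C(sI - A + LC)^{-1}F_{ai} = \sum_{j=1}^{\delta_{ai}} \frac{\alpha_{j,1}}{s - \lambda_{ai,j}}\,Cx_j = C\,[\,\alpha_{1,1}x_1 \;\; \alpha_{2,1}x_2 \;\cdots\; \alpha_{\delta_{ai},1}x_{\delta_{ai}}\,] \begin{bmatrix} \frac{1}{s - \lambda_{ai,1}} \\ \frac{1}{s - \lambda_{ai,2}} \\ \vdots \\ \frac{1}{s - \lambda_{ai,\delta_{ai}}} \end{bmatrix}$$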

By  using  (15), 

C(sI-A  +  LC)-1Fai  =  CAk°iFai[  0  ...  0  1 


8  Aa4,l 
1 

s  ^ai,2 


5—Aai)45ai  j 


(16) 


"  1 

Aai}l 

\  fcai 

t'ai.l 

-1 

1 

Aai,l 

Zai,l-“Aaiti 

^ai,pa4  —  Aai}t 

s~  Aai,l 

1 

AGi52 

i^oia 

1 

Aai,2 

^ai,l  Aai,2 

5  — Aai,2 

1 

AGi,^oi 

\ktt% 

^at.l 

1 

AGi,5ai 

zaitl  Aai,iaj 

-  s~^aitSai  . 
By using Cramer's rule in matrix theory [10], (16) can be written as the ratio of two determinants multiplying $CA^{k_{ai}}F_{ai}$ (17). The denominator is the determinant of the matrix that is inverted in (16). The numerator is the determinant of the same matrix with its $(k_{ai}+1)$th column, which contains $\lambda_{ai,j}^{k_{ai}}$, replaced by the column of $1/(s - \lambda_{ai,j})$, $j = 1 \cdots \delta_{ai}$.

By using the determinant operations in matrix theory [10], the numerator of (17) becomes the denominator of (17) multiplied by

$$\frac{\prod_{j=1}^{p_{ai}}(s - z_{ai,j})}{\prod_{j=1}^{\delta_{ai}}(s - \lambda_{ai,j})}$$

Therefore, from (17),

$$\frac{r(s)}{\mu_{ai}(s)} = \frac{\prod_{j=1}^{p_{ai}}(s - z_{ai,j})}{\prod_{j=1}^{k_{ai}+p_{ai}+1}(s - \lambda_{ai,j})}\,CA^{k_{ai}}F_{ai} \tag{18}$$

Remark 1. When $\mu_{ai}(t) = \sum_{j=1}^{p_{ai}} a_j e^{z_{ai,j} t}$, where the $a_j$ are arbitrary constants, $r(t)$ becomes zero after a while [11]. Therefore, the fault detection filter cannot detect this type of actuator fault because the residual has only a transient response when the fault occurs. For example, if $(C, A, F_{ai})$ has an invariant zero at the origin, the fault detection filter cannot detect the actuator fault if it is a bias.
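
The blocking effect described in Remark 1 can be illustrated with a scalar transfer function. The sketch below is only an illustration under stated assumptions: the helper `tf_gain` and the pole and zero locations are hypothetical and are not taken from the report. For a stable channel, the steady-state response to an input $e^{zt}$ is the gain evaluated at $s = z$, so a zero at $z$ leaves only the transient.

```python
def tf_gain(s, zeros, poles, k=1.0):
    """Evaluate k * prod(s - z) / prod(s - p) at the complex frequency s."""
    num = complex(k)
    for z in zeros:
        num *= (s - z)
    den = complex(1.0)
    for p in poles:
        den *= (s - p)
    return num / den

# Hypothetical residual channel with a zero at the origin (assumed values).
zeros, poles = [0.0], [-2.0, -5.0]

dc_gain = tf_gain(0.0, zeros, poles)    # gain seen by a constant bias
gain_at_1 = tf_gain(1j, zeros, poles)   # gain seen by a 1 rad/s sinusoid
```

Here `dc_gain` is zero, so a bias produces no steady-state residual, while `gain_at_1` is nonzero, so a sinusoidal fault at 1 rad/s still appears in the residual.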


3.2  Sensor  Fault 

From (4) and (6), the transfer matrix from the sensor fault $\mu_{si}$ to the residual $r$ is

\[ \frac{r(s)}{\mu_{si}(s)} = E_{si} - C(sI - A + LC)^{-1} L E_{si} = C(sI - A + LC)^{-1} \left[ (sI - A + LC) f_{si} - LC f_{si} \right] \]

\[ = s\, C(sI - A + LC)^{-1} f_{si} - C(sI - A + LC)^{-1} \bar{f}_{si} \tag{19} \]

because $E_{si} = C f_{si}$ and $\bar{f}_{si} = A f_{si}$. Note that $[\, f_{si} \;\; \bar{f}_{si} \,]$ is used for fault detection filter design. When $(C, A, f_{si})$ has $p_{si,1}$ invariant zeros at $z_{si,1} \cdots z_{si,p_{si,1}}$, the dimension of $\mathcal{T}_{si,1}$, the detection space of $f_{si}$, is $p_{si,1} + 1$ because $C f_{si} = E_{si} \neq 0$. When $(C, A, \bar{f}_{si})$ has $p_{si,2}$ invariant zeros at $z_{si,p_{si,1}+1} \cdots z_{si,p_{si,1}+p_{si,2}}$, the dimension of $\mathcal{T}_{si,2}$, the detection space of $\bar{f}_{si}$, is $k_{si} + p_{si,2} + 1$, where $k_{si}$ is the smallest non-negative integer such that $C A^{k_{si}} \bar{f}_{si} \neq 0$. For the fault detection filter, $\mathcal{T}_{si,1} \oplus \mathcal{T}_{si,2}$ spans $k_{si} + p_{si,1} + p_{si,2} + 2$ eigenvectors of $A - LC$. For the fault reconstruction process, it is assumed that $\mathcal{T}_{si,1}$ and $\mathcal{T}_{si,2}$ span $p_{si,1} + 1$ and $k_{si} + p_{si,2} + 1$ eigenvectors of $A - LC$, respectively. This can be achieved by considering $f_{si}$ and $\bar{f}_{si}$ as two separate faults when designing the fault detection filter. It is also assumed that $(C, A, f_{si})$ and $(C, A, \bar{f}_{si})$ are mutually detectable.

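By following the same derivation as in Section 3.1,

\[ C(sI - A + LC)^{-1} f_{si} = \frac{\prod_{j=1}^{p_{si,1}} (s - z_{si,j})}{\prod_{j=1}^{p_{si,1}+1} (s - \lambda_{si,j})} \, C f_{si} \]

\[ C(sI - A + LC)^{-1} \bar{f}_{si} = \frac{\prod_{j=p_{si,1}+1}^{p_{si,1}+p_{si,2}} (s - z_{si,j})}{\prod_{j=p_{si,1}+2}^{k_{si}+p_{si,1}+p_{si,2}+2} (s - \lambda_{si,j})} \, C A^{k_{si}} \bar{f}_{si} \]

where $\lambda_{si,1} \cdots \lambda_{si,p_{si,1}+1}$ and $\lambda_{si,p_{si,1}+2} \cdots \lambda_{si,k_{si}+p_{si,1}+p_{si,2}+2}$ are the eigenvalues of the fault detection filter associated with $\mathcal{T}_{si,1}$ and $\mathcal{T}_{si,2}$, respectively. Therefore, from (19),

\[ \frac{r(s)}{\mu_{si}(s)} = \frac{s \prod_{j=1}^{p_{si,1}} (s - z_{si,j})}{\prod_{j=1}^{p_{si,1}+1} (s - \lambda_{si,j})} \, E_{si} - \frac{\prod_{j=p_{si,1}+1}^{p_{si,1}+p_{si,2}} (s - z_{si,j})}{\prod_{j=p_{si,1}+2}^{k_{si}+p_{si,1}+p_{si,2}+2} (s - \lambda_{si,j})} \, C A^{k_{si}} \bar{f}_{si} \tag{20} \]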


Remark 2. For certain types of sensor faults, the residual has only a transient response when the fault occurs and becomes zero after a while even though the fault still exists. Consider the following system

\[ \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} A_{11} & 0 \\ A_{21} & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} u, \qquad y = \begin{bmatrix} C_1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \]

For the fault in the sensor that measures $x_2$, its fault directions are

\[ E_s = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad [\, f_s \;\; \bar{f}_s \,] = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \]

Then, the transfer matrix from the sensor fault to the residual is

\[ \frac{r(s)}{\mu_s(s)} = \frac{s \prod_{j=1}^{p_s} (s - z_{s,j})}{\prod_{j=1}^{p_s+1} (s - \lambda_{s,j})} \, E_s \]

When the sensor fault is a bias, the residual becomes zero after a while because the transfer matrix has a zero at the origin [11]. Therefore, the fault detection filter cannot detect this type of sensor fault because the residual has only a transient response when the fault occurs. Note that the zero at the origin is not an invariant zero of the fault. One possible example is the bias in a single position sensor, i.e., $x_2$ is the integral of one of the states. From the physical point of view, this is consistent with the fact that the other states are not affected by the position and only affect the derivative of the position. Therefore, they cannot be used to detect the bias in the position sensor. However, the fault reconstruction process, discussed in Section 4, can still generate the magnitudes of the faults even after the residual becomes zero. This is demonstrated by the numerical example in Section 5.


4  Fault  Reconstruction 


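From (18) and (20), the relationship between the residual and all the actuator and sensor faults can be expressed as

\[ r(s) = \sum_{i=1}^{q_a} \frac{\prod_{j=1}^{p_{ai}} (s - z_{ai,j})}{\prod_{j=1}^{k_{ai}+p_{ai}+1} (s - \lambda_{ai,j})} \, C A^{k_{ai}} F_{ai} \, \mu_{ai}(s) + \sum_{i=1}^{q_s} \left[ \frac{s \prod_{j=1}^{p_{si,1}} (s - z_{si,j})}{\prod_{j=1}^{p_{si,1}+1} (s - \lambda_{si,j})} \, E_{si} - \frac{\prod_{j=p_{si,1}+1}^{p_{si,1}+p_{si,2}} (s - z_{si,j})}{\prod_{j=p_{si,1}+2}^{k_{si}+p_{si,1}+p_{si,2}+2} (s - \lambda_{si,j})} \, C A^{k_{si}} \bar{f}_{si} \right] \mu_{si}(s) \tag{21} \]

In Section 4.1, the reconstruction of the actuator fault is discussed. In Section 4.2, the reconstruction of the sensor fault is discussed.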


4.1  Actuator  Fault 

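In order to reconstruct the actuator fault $\mu_{ai}$, a projected residual that is sensitive only to $\mu_{ai}$, and not to the other faults, is needed. Define a projector $H_{ai}$ that annihilates all the faults except $\mu_{ai}$,

\[ H_{ai} = I - \operatorname{Ker} H_{ai} \left[ (\operatorname{Ker} H_{ai})^T \operatorname{Ker} H_{ai} \right]^{-1} (\operatorname{Ker} H_{ai})^T \tag{22} \]

where $\operatorname{Ker} H_{ai} = \operatorname{Im} [\, C A^{k_{a1}} F_{a1} \;\; C A^{k_{a2}} F_{a2} \;\cdots\; C A^{k_{a,i-1}} F_{a,i-1} \;\; C A^{k_{a,i+1}} F_{a,i+1} \;\cdots\; C A^{k_{a,q_a}} F_{a,q_a} \;\; E_{s1} \;\; E_{s2} \;\cdots\; E_{s,q_s} \;\; C A^{k_{s1}} \bar{f}_{s1} \;\; C A^{k_{s2}} \bar{f}_{s2} \;\cdots\; C A^{k_{s,q_s}} \bar{f}_{s,q_s} \,]$. Note that $H_{ai}$ is the same as the projector (10) used by the fault detection filter. By operating $H_{ai}$ on the residual, (21) becomes

\[ H_{ai} r(s) = \frac{\prod_{j=1}^{p_{ai}} (s - z_{ai,j})}{\prod_{j=1}^{k_{ai}+p_{ai}+1} (s - \lambda_{ai,j})} \, H_{ai} C A^{k_{ai}} F_{ai} \, \mu_{ai}(s) \]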
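
The projector construction above can be sketched numerically. This is a minimal illustration, assuming NumPy is available and that the matrix whose columns span $\operatorname{Ker} H_{ai}$ has full column rank. The function name `annihilating_projector` and the direction vectors are hypothetical and are not taken from the report.

```python
import numpy as np

def annihilating_projector(K):
    """Return H = I - K (K^T K)^{-1} K^T, whose null space is the column space of K."""
    K = np.asarray(K, dtype=float)
    m = K.shape[0]
    # Solving (K^T K) X = K^T avoids forming the inverse explicitly;
    # K is assumed to have full column rank.
    return np.eye(m) - K @ np.linalg.solve(K.T @ K, K.T)

# Hypothetical directions in a 3-measurement output space (assumed values).
nuisance = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [0.0, 0.0]])       # columns span Ker H
target = np.array([1.0, 2.0, 3.0])      # direction of the fault to be kept

H = annihilating_projector(nuisance)
blocked = H @ nuisance                  # nuisance directions are annihilated
kept = H @ target                       # target survives because it is not in Im(nuisance)
```

With these values `kept` is `[0, 0, 3]`: only the component of the target direction outside the span of the nuisance directions remains in the projected residual.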

Therefore, the projected residual $H_{ai} r$ is sensitive only to $\mu_{ai}$, and not to the other faults. Let $q_{ai}$ be an $m$ by 1 vector, where $m$ is the number of measurements. By operating $q_{ai}^T$ on $H_{ai} r$, the actuator fault $\mu_{ai}$ can be reconstructed from the projected residual $H_{ai} r$ by using

\[ \mu_{ai}(s) = \frac{1}{q_{ai}^T H_{ai} C A^{k_{ai}} F_{ai}} \, \frac{\prod_{j=1}^{k_{ai}+p_{ai}+1} (s - \lambda_{ai,j})}{\prod_{j=1}^{p_{ai}} (s - z_{ai,j})} \, q_{ai}^T H_{ai} r(s) \tag{23} \]

if all the invariant zeros of $(C, A, F_{ai})$ are in the left-half plane. Since $C A^{k_{ai}} F_{ai} \notin \operatorname{Ker} H_{ai}$, $H_{ai} C A^{k_{ai}} F_{ai} \neq 0$ and there exists $q_{ai}$ such that $q_{ai}^T H_{ai} C A^{k_{ai}} F_{ai} \neq 0$, for example, $q_{ai} = H_{ai} C A^{k_{ai}} F_{ai}$. Note that Silverman’s algorithm requires the left inverse of $H_{ai} C A^{k_{ai}} F_{ai}$ [8]. Also note that (23) is not proper. In order to avoid differentiating and amplifying the disturbance, a $(k_{ai} + 1)$-dimensional low-pass filter with poles assigned in a Butterworth configuration may be used, at the expense of introducing a delay in reconstructing the actuator fault.

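Since $q_{ai}$ is not unique, an optimization problem is formulated to determine $q_{ai}$ by considering the disturbance. Consider the system (4) with the colored noise $w$,

\[ \dot{x} = A x + B u + B_w w + \sum_{i=1}^{q_a} F_{ai} \mu_{ai} \]

\[ y = C x + D_w w + \sum_{i=1}^{q_s} E_{si} \mu_{si} \]

where $\dot{w} = A_w w + \bar{w}$ and $\bar{w}$ is the white noise. Then,

\[ q_{ai}^T H_{ai} r(s) = q_{ai}^T G_{\mu_{ai}}(s) \mu_{ai}(s) + q_{ai}^T G_{w_{ai}}(s) \bar{w}(s) \]

where $G_{\mu_{ai}}(s) = \frac{\prod_{j=1}^{p_{ai}} (s - z_{ai,j})}{\prod_{j=1}^{k_{ai}+p_{ai}+1} (s - \lambda_{ai,j})} H_{ai} C A^{k_{ai}} F_{ai}$ and $G_{w_{ai}}(s) = H_{ai} \left[ C(sI - A + LC)^{-1} (B_w - L D_w) + D_w \right] (sI - A_w)^{-1}$. The optimal $q_{ai}$ is determined by minimizing the ratio of the $\mathcal{H}_2$ norm of the transfer matrix from $\bar{w}$ to $q_{ai}^T H_{ai} r$ over the $\mathcal{H}_2$ norm of the transfer function from $\mu_{ai}$ to $q_{ai}^T H_{ai} r$,

\[ \min_{q_{ai}} J = \min_{q_{ai}} \frac{\| q_{ai}^T G_{w_{ai}} \|_2^2}{\| q_{ai}^T G_{\mu_{ai}} \|_2^2} \tag{24} \]

This minimization problem can be solved by rewriting the cost criterion as

\[ J = \| q_{ai}^T G_{w_{ai}} \|_2^2 - \gamma \| q_{ai}^T G_{\mu_{ai}} \|_2^2 = q_{ai}^T \bar{G}_{w_{ai}} q_{ai} - \gamma \, q_{ai}^T \bar{G}_{\mu_{ai}} q_{ai} \]

where $\gamma$ is a Lagrange multiplier, $\bar{G}_{w_{ai}} = \frac{1}{2\pi} \int_{-\infty}^{\infty} G_{w_{ai}}(j\omega) G_{w_{ai}}^T(-j\omega) \, d\omega$ and $\bar{G}_{\mu_{ai}} = \frac{1}{2\pi} \int_{-\infty}^{\infty} G_{\mu_{ai}}(j\omega) G_{\mu_{ai}}^T(-j\omega) \, d\omega$. Note that $\bar{G}_{w_{ai}}$ and $\bar{G}_{\mu_{ai}}$ can be computed by using their state-space models [12]. For example, $\bar{G}_{\mu_{ai}} = H_{ai} C W_{\mu_{ai}} C^T H_{ai}$, where $W_{\mu_{ai}}$ is the controllability gramian of $(A - LC, F_{ai})$, and $\bar{G}_{w_{ai}} = H_{ai} [\, C \;\; D_w \,] W_{w_{ai}} [\, C \;\; D_w \,]^T H_{ai}$, where $W_{w_{ai}}$ is the controllability gramian of the augmented error and colored-noise dynamics. From the first-order necessary condition,

\[ \frac{\partial J}{\partial q_{ai}} = q_{ai}^T \bar{G}_{w_{ai}} - \gamma \, q_{ai}^T \bar{G}_{\mu_{ai}} = 0 \;\Rightarrow\; \bar{G}_{w_{ai}} q_{ai} = \gamma \, \bar{G}_{\mu_{ai}} q_{ai} \tag{25} \]

Therefore, the optimal $q_{ai}$ is the generalized eigenvector of $(\bar{G}_{w_{ai}}, \bar{G}_{\mu_{ai}})$ associated with the smallest generalized eigenvalue, and the optimal $J$ is the smallest generalized eigenvalue. Note that the ranks of $\bar{G}_{w_{ai}}$ and $\bar{G}_{\mu_{ai}}$ are $m - \dim(\operatorname{Ker} H_{ai})$ and 1, respectively. To solve for the generalized eigenvector, it is more numerically robust if the dimension of $\bar{G}_{w_{ai}}$ and $\bar{G}_{\mu_{ai}}$ is reduced from $m$ to $m - \dim(\operatorname{Ker} H_{ai})$.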
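
The generalized eigenvalue condition (25) can be sketched for the reduced problem. The sketch below rests on assumptions that go beyond the text: NumPy is available, the noise matrix has already been reduced to an invertible (symmetric positive-definite) block, and the rank-one signal matrix is written as $g g^T$ for some vector $g$. The function names and numerical values are hypothetical. Under these assumptions the only finite generalized eigenvalue is $1 / (g^T G_w^{-1} g)$, attained at $q = G_w^{-1} g$.

```python
import numpy as np

def optimal_q_rank_one(G_w, g):
    """Minimise (q^T G_w q) / (q^T g g^T q) for symmetric positive-definite G_w.

    With the rank-one denominator g g^T, the only finite generalized eigenvalue
    of G_w q = gamma * (g g^T) q is gamma = 1 / (g^T G_w^{-1} g), attained at
    q = G_w^{-1} g. Any nonzero scaling of q gives the same cost.
    """
    q = np.linalg.solve(G_w, g)
    gamma = 1.0 / float(g @ q)
    return q, gamma

def cost(q, G_w, g):
    """Ratio of noise transmission to fault transmission for a given q."""
    return float(q @ G_w @ q) / float(q @ g) ** 2

# Hypothetical reduced matrices (assumed values, not from the report).
G_w = np.array([[2.0, 0.5],
                [0.5, 1.0]])
g = np.array([1.0, 1.0])

q_opt, gamma = optimal_q_rank_one(G_w, g)
```

For these values the minimum cost is 0.875, which is lower than the cost of either unit vector (2.0 and 1.0). This mirrors the comparison made in the numerical example between the optimal and an arbitrarily chosen vector.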
4.2 Sensor Fault


In order to obtain a projected residual that is sensitive only to the sensor fault $\mu_{si}$, and not to the other faults, a projector $H_{si,1}$ is defined as

\[ H_{si,1} = I - \operatorname{Ker} H_{si,1} \left[ (\operatorname{Ker} H_{si,1})^T \operatorname{Ker} H_{si,1} \right]^{-1} (\operatorname{Ker} H_{si,1})^T \tag{26} \]

where $\operatorname{Ker} H_{si,1} = \operatorname{Im} [\, C A^{k_{a1}} F_{a1} \;\; C A^{k_{a2}} F_{a2} \;\cdots\; C A^{k_{a,q_a}} F_{a,q_a} \;\; E_{s1} \;\; E_{s2} \;\cdots\; E_{s,q_s} \;\; C A^{k_{s1}} \bar{f}_{s1} \;\; C A^{k_{s2}} \bar{f}_{s2} \;\cdots\; C A^{k_{s,i-1}} \bar{f}_{s,i-1} \;\; C A^{k_{s,i+1}} \bar{f}_{s,i+1} \;\cdots\; C A^{k_{s,q_s}} \bar{f}_{s,q_s} \,]$. Note that $H_{si,1}$ is different from the projector (10) used by the fault detection filter: $E_{si}$ is not in the null space of projector (10), but it is in the null space of $H_{si,1}$. By following the same derivation as in Section 4.1, the sensor fault $\mu_{si}$ can be reconstructed from the projected residual $H_{si,1} r$ by using

\[ \mu_{si}(s) = -\frac{1}{q_{si,1}^T H_{si,1} C A^{k_{si}} \bar{f}_{si}} \, \frac{\prod_{j=p_{si,1}+2}^{k_{si}+p_{si,1}+p_{si,2}+2} (s - \lambda_{si,j})}{\prod_{j=p_{si,1}+1}^{p_{si,1}+p_{si,2}} (s - z_{si,j})} \, q_{si,1}^T H_{si,1} r(s) \tag{27} \]

if all the invariant zeros of $(C, A, \bar{f}_{si})$ are in the left-half plane. The optimal $q_{si,1}$ can be determined similarly as in Section 4.1. Note that (27) is not proper and a $(k_{si} + 1)$-dimensional low-pass filter may be used to reduce the effect of the disturbance at the expense of introducing a delay in reconstructing the sensor fault.

There is an alternative approach to reconstruct the sensor fault. Define a projector $H_{si,2}$ as

\[ H_{si,2} = I - \operatorname{Ker} H_{si,2} \left[ (\operatorname{Ker} H_{si,2})^T \operatorname{Ker} H_{si,2} \right]^{-1} (\operatorname{Ker} H_{si,2})^T \tag{28} \]

where $\operatorname{Ker} H_{si,2} = \operatorname{Im} [\, C A^{k_{a1}} F_{a1} \;\; C A^{k_{a2}} F_{a2} \;\cdots\; C A^{k_{a,q_a}} F_{a,q_a} \;\; E_{s1} \;\; E_{s2} \;\cdots\; E_{s,i-1} \;\; E_{s,i+1} \;\cdots\; E_{s,q_s} \;\; C A^{k_{s1}} \bar{f}_{s1} \;\; C A^{k_{s2}} \bar{f}_{s2} \;\cdots\; C A^{k_{s,q_s}} \bar{f}_{s,q_s} \,]$. Then, the sensor fault $\mu_{si}$ can also be reconstructed from the projected residual $H_{si,2} r$ by using

\[ \mu_{si}(s) = \frac{1}{q_{si,2}^T H_{si,2} E_{si}} \, \frac{\prod_{j=1}^{p_{si,1}+1} (s - \lambda_{si,j})}{s \prod_{j=1}^{p_{si,1}} (s - z_{si,j})} \, q_{si,2}^T H_{si,2} r(s) \tag{29} \]

if all the invariant zeros of $(C, A, f_{si})$ are in the left-half plane. The optimal $q_{si,2}$ can be determined similarly as in Section 4.1, except that a finite frequency range of the designer’s choice would be used instead of the frequency range from $-\infty$ to $\infty$ because the transfer matrix from $\mu_{si}$ to $q_{si,2}^T H_{si,2} r$ is not strictly proper. Note that (29) is proper. The reconstructed sensor fault generated by (29) may be less sensitive to the disturbance than the one generated by (27) because the disturbance is not differentiated. Furthermore, since a low-pass filter is not required for (29), there is no delay in reconstructing the sensor fault. However, (29) is only stable in the sense of Lyapunov, not asymptotically stable, so the effect of the disturbance may accumulate over time.
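
The behaviour of the proper reconstruction (29) for a bias can be sketched with a scalar, noise-free channel. The sketch assumes a single residual channel $r = \frac{s}{s+a} \mu$ with no invariant zeros, a unit direction gain, and forward-Euler integration. The function name, the value of `a`, and the step sizes are hypothetical choices for illustration.

```python
def simulate_bias_reconstruction(a=3.5, bias=1.0, dt=1e-3, t_end=5.0):
    """Euler simulation of r = s/(s+a) * mu and mu_hat = (s+a)/s * r for a step bias."""
    n = int(round(t_end / dt))
    x = 0.0        # state of the residual channel: x' = -a x + mu
    integ = 0.0    # running integral of the residual used by the reconstruction
    r = mu_hat = 0.0
    for _ in range(n):
        r = bias - a * x          # residual, since s/(s+a) = 1 - a/(s+a)
        mu_hat = r + a * integ    # reconstruction, since (s+a)/s = 1 + a/s
        x += dt * (bias - a * x)
        integ += dt * r
    return r, mu_hat

final_residual, final_estimate = simulate_bias_reconstruction()
```

The residual decays toward zero while the reconstructed fault stays at the bias value, because the integrator in the reconstruction holds the information that the residual no longer carries. The same integrator is why any disturbance entering the residual would accumulate, as noted above.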
Remark 3. Consider a linear time-invariant system that is observable with all sensors, but unobservable without one of the sensors,

\[ \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} u, \qquad y = \begin{bmatrix} C_1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \]

This system is unobservable without the sensor that measures $x_2$. For the fault in this sensor, its fault directions are

\[ E_s = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad [\, f_s \;\; \bar{f}_s \,] = \begin{bmatrix} 0 & 0 \\ 1 & A_{22} \end{bmatrix} \]

Then, the transfer matrix from the sensor fault to the residual is

\[ \frac{r(s)}{\mu_s(s)} = \frac{(s - A_{22}) \prod_{j=1}^{p_s} (s - z_{s,j})}{\prod_{j=1}^{p_s+1} (s - \lambda_{s,j})} \, E_s \]

Hence, the eigenvalue associated with the unobservable mode becomes one of the poles of the fault reconstruction process. Therefore, for reconstructing a sensor fault, the system has to be detectable with respect to the other sensors.


5  Numerical  Example 


Consider a linear time-invariant system with

A fault detection filter is designed to detect and identify the faults in the actuator, second sensor and fifth sensor. Note that the fifth sensor can be considered a position sensor because it measures the sixth state, which is the integral of the fifth state and does not affect the other states. From (4) and (5), the fault directions are

By using the design algorithm in [5], the fault detection filter gain is obtained as


\[ L = \begin{bmatrix}
-21.4426 & 1.0000 & 13.2130 & 23.9835 & 0.0000 \\
-6.9663 & -0.5000 & -0.1685 & 12.1966 & 0.0000 \\
11.2900 & 2.0000 & -9.4501 & -15.6101 & -0.0000 \\
-14.8214 & -1.0000 & 13.1070 & 26.3927 & 0.0000 \\
16.2212 & -5.0000 & -8.1061 & -27.9910 & -0.0000 \\
-4.3015 & 0.0000 & 2.5077 & 5.7139 & 3.5000
\end{bmatrix} \]

The eigenvalue associated with $F_a$ is $-5$. The eigenvalues associated with $f_{s1}$ and $\bar{f}_{s1}$ are $-2.5$ and $-3$, respectively. The eigenvalue associated with $f_{s2}$ is $-3.5$.

To evaluate the performance of the fault detection filter, an actuator fault and two sensor faults are imposed on the system separately. The actuator fault simulates a stuck actuator. In Figure 2, the top left figure shows the control command. The middle left figure shows the actuator fault $\mu_a$, which occurs at the sixth second. The bottom left figure shows the control input applied to the system, which is the sum of the control command and the actuator fault. It shows that the actuator is stuck at 1 after the sixth second regardless of the control command. The sensor faults simulate biases developed in the sensors. In Figures 3 and 4, the top left figures show the second sensor fault $\mu_{s1}$ and the fifth sensor fault $\mu_{s2}$, respectively, which start at the fourth second and end at the twelfth second. Figure 5 shows the time response of the norms of the three projected residuals generated by the fault detection filter (6) using projectors (10) in the presence of colored sensor noise, where the noise dynamics matrix is $-1000I$, the power spectral density of the white noise is $2I$, $B_w = 0$ and $D_w = I$. Each row shows the projected residuals when one of the faults occurs. Each column shows one of the projected residuals when the faults occur. Note that only the projected residual associated with the faulty instrument becomes large when the fault occurs. However, the projected residual associated with the fifth sensor becomes small after a while even though the fifth sensor fault still exists. This is consistent with the discussion in Remark 2. Therefore, the fault detection filter can detect and identify the actuator and second sensor faults, but not the fifth sensor fault.

To reconstruct these three faults, the relationship between the residual and the faults is obtained from (21),

\[ r(s) = \frac{1}{s+5} C F_a \mu_a(s) + \left( \frac{s}{s+2.5} E_{s1} - \frac{1}{s+3} C \bar{f}_{s1} \right) \mu_{s1}(s) + \frac{s}{s+3.5} E_{s2} \mu_{s2}(s) \]

From (22), the projector $H_a$ used for reconstructing the actuator fault is obtained by annihilating $[\, E_{s1} \;\; C\bar{f}_{s1} \;\; E_{s2} \,]$. From (23), the actuator fault can be reconstructed from

\[ \mu_a(s) = \frac{s + 5}{q_a^T H_a C F_a} \, q_a^T H_a r(s) \tag{30} \]

with the optimal $q_a = [\, 0.9114 \;\; 0 \;\; -0.3903 \;\; 0.1308 \;\; 0 \,]$ determined from (25). In Figure 2, the top middle figure shows the reconstructed actuator fault generated by (30). To reduce the effect of the sensor noise, a low-pass filter is added to (30),

\[ \mu_a(s) = \frac{20 (s + 5)}{q_a^T H_a C F_a \, (s + 20)} \, q_a^T H_a r(s) \tag{31} \]

The initial condition of the fault reconstruction process is zero given that there is no fault initially. The middle center figure shows the reconstructed actuator fault generated by (31), which is close to the actual actuator fault shown in the middle left figure. By adding this reconstructed actuator fault to the control command, the control input applied to the system is reconstructed and shown in the bottom middle figure, which is close to the actual control input shown in the bottom left figure. This information can be used to evaluate the condition of the actuator; in this case, the actuator is found to be stuck. To demonstrate that the reconstructed actuator fault generated with an arbitrarily chosen $q_a$ is more sensitive to the sensor noise than the one generated with the $q_a$ derived from solving (24), the top right figure shows the reconstructed actuator fault generated by (30) with $q_a$ arbitrarily chosen as $[\, 0 \;\; 0 \;\; 0 \;\; 1 \;\; 0 \,]$. To reduce the effect of the sensor noise, a low-pass filter with a slower pole is used.

(32)

The middle right figure shows the reconstructed actuator fault generated by (32), whose delay in reconstructing the actuator fault is worse than that of (31). This becomes clearer in the bottom right figure, where the reconstructed control input is shown.
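
The delay introduced by the low-pass form (31) can be sketched with a scalar simulation. The sketch assumes a noise-free channel in which the projected residual is normalized so that $q_a^T H_a C F_a = 1$, giving $r = \mu_a / (s + 5)$, and uses forward-Euler integration. The function name, the step size, and the fault profile are hypothetical.

```python
def reconstruct_with_lowpass(fault, lam=5.0, pole=20.0, dt=1e-3):
    """Euler simulation of r = mu/(s+lam) and mu_hat = pole*(s+lam)/(s+pole) * r."""
    x_r = 0.0    # residual channel state, r = x_r
    x_f = 0.0    # low-pass filter state: x_f' = -pole*x_f + r
    estimates = []
    for mu in fault:
        r = x_r
        # pole*(s+lam)/(s+pole) = pole + pole*(lam - pole)/(s+pole)
        estimates.append(pole * r + pole * (lam - pole) * x_f)
        x_r += dt * (-lam * x_r + mu)
        x_f += dt * (-pole * x_f + r)
    return estimates

# Hypothetical fault profile: zero for 1 s, then stuck at 1 for 3 s.
fault = [0.0] * 1000 + [1.0] * 3000
estimate = reconstruct_with_lowpass(fault)
```

In this idealized channel the cascade reduces to $20 / (s + 20)$, so the estimate reaches about 63% of the fault magnitude 0.05 s after onset and then settles at the true value. A slower filter pole would lengthen this delay, which matches the comparison between (31) and (32).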

From (26) and (28), two projectors used for reconstructing the second sensor fault are obtained by annihilating $[\, C F_a \;\; E_{s1} \;\; E_{s2} \,]$ and $[\, C F_a \;\; C\bar{f}_{s1} \;\; E_{s2} \,]$, respectively. In Figure 3, the middle left and bottom left figures show the reconstructed second sensor faults generated by (27) with a low-pass filter and by (29), respectively. Both are close to the actual sensor fault shown in the top left figure. However, the reconstructed sensor fault generated by (27) is more sensitive to the sensor noise and has a delay due to the low-pass filter. By subtracting the reconstructed sensor faults from the faulty measurement, the second measurements are reconstructed and shown in the middle and bottom right figures, which are close to the correct measurement shown in the top right figure. In the middle right figure, the spikes at the fourth and twelfth seconds are due to the delay in reconstructing the sensor fault.

Figure 2: Fault reconstruction for the actuator (recoverable panel titles: control command, reconstructed actuator fault)

For reconstructing the fifth sensor fault, the projector can only be obtained from (28) by annihilating $[\, C F_a \;\; E_{s1} \;\; C\bar{f}_{s1} \,]$. In Figure 4, the bottom left figure shows the reconstructed sensor fault generated by (29), which is close to the actual sensor fault shown in the top left figure. The bottom right figure shows the reconstructed fifth measurement, which is close to the correct measurement shown in the top right figure. Note that the fault reconstruction process can still generate the magnitude of the fifth sensor fault even after the projected residual becomes zero, as shown in the bottom right figure of Figure 5.

6  Conclusion 

The  fault  reconstruction  process  generates  the  magnitudes  of  sensor  and  actuator  faults  using  the  residual 
generated  by  the  fault  detection  filter.  An  optimal  fault  reconstruction  process  is  derived  from  solving  a 
minimization  problem  by  considering  the  disturbance.  For  the  existence  of  the  fault  reconstruction  process, 
the  invariant  zeros  of  the  fault  have  to  be  in  the  left-half  plane.  Furthermore,  for  reconstructing  a  sensor 
fault,  the  system  has  to  be  detectable  with  respect  to  the  other  sensors.  Although  the  fault  reconstruction 
process can also be derived numerically by using Silverman’s algorithm, it is not optimal in general and
its existence conditions and analytical structure cannot be obtained.

Figure 5: Projected residuals generated by the fault detection filter (columns: actuator residual, 2nd sensor residual, 5th sensor residual; horizontal axes: time, sec)


Appendix 


If  the  first  eigenvalue  of  A  —  LC  associated  with  Ta%  is  at  the  first  invariant  zero  of  (C,  A,  Fai),  i.e., 
Xa% i  =  u  (14)  is  still  true  except  when  j  =  k  —  1,  (14)  becomes  Paitia  1,1  =  0*  If  &ai, l  =  0?  (IS)  and 
(16)  become 


Fai  AFai  •••  AkaiFai  vaiii 


al,l  «l,l^oi,l 
1  ^oi,  2 


al,fc„i+2 


Vaijl 


=  [  &2tlZ2 

&lfl  &ait2 

fi»,2 

Zai,2—&ai,2 


■  a$*iAXSai 

fgjffai 

Aq«,1 

_ Poi»PflT 

Aa£,2 


1  X  -  f  ...  _  _ &ai,2 

at'Sai  A*iA*i  Zaii-Xai*.  Zaif  2-Aa 


_ -±i-£ai _ 


■ 

ai,i  1 

1 

3^1  <*2,1^2  * 

S  Aai,l  S  Aai,2 

S-Aa  i,Sai  , 

Then,  by  following  the  same  derivation  in  Section  3.1,  (18)  can  be  derived  with  a  pole-zero  cancellation.  If 
a i}i  =  0,  (15)  and  (16)  become 


Fai  AFai  ”*  AkaiFm 


&ai.l  *  *  * 


fcoi+2^1  02,1212  •••  OcsaillXsai 


0  0 
1  ^01,2 


1 

0 

0 

&ai,  2 

^aifPni 

Zaitl  Aai,2 

Za«,2—  Aai,2 

Zaitpaj  Aai  ,2 

&ai,l 

&ait2 

Pai>Pai 

Zaifl—^aitSai 

^ai  ,2  Aat,da^ 

Zai,Pai~*ai>*ai 
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and 


C(sl -A  +  LC)  1Fai-C[  aitkai+2xi  a2, ix2  aSaUixSai  ]  [  0  a_^a  •••  — j 

Then,  by  following  the  same  derivation  in  Section  3.1,  (18)  can  also  be  derived  with  a  pole-zero  cancellation. 

If  ^ai,i  =  a iti  =  0,  the  derivation  is  similar.  Note  that  the  two  derivations  above  can  be  extended  to  the 

case where multiple eigenvalues of $A - LC$ associated with $\mathcal{T}_{ai}$ are at the invariant zeros of $(C, A, F_{ai})$.

SUMMARY

A  new  robust  multiple-fault  detection  and  identification  algorithm  is  determined.  Different  from  other 
algorithms  which  explicitly  force  the  geometric  structure  by  using  eigenstructure  assignment  or  geometric 
theory,  this  algorithm  is  derived  from  solving  an  optimization  problem.  The  output  error  is  divided  into 
several  subspaces.  For  each  subspace,  the  transmission  from  one  fault,  denoted  the  associated  target  fault, 
is  maximized  while  the  transmission  from  other  faults,  denoted  the  associated  nuisance  fault,  is  minimized. 
Therefore,  each  projected  residual  of  the  robust  multiple-fault  detection  filter  is  affected  primarily  by  one 
fault  and  minimally  by  other  faults.  The  transmission  from  process  and  sensor  noises  is  also  minimized  so 
that  the  filter  is  robust  with  respect  to  these  disturbances.  It  is  shown  that,  in  the  limit  where  the  weighting 
on  each  associated  nuisance  fault  transmission  goes  to  infinity,  the  filter  recovers  the  geometric  structure  of 
the restricted diagonal detection filter of which the Beard-Jones detection filter and unknown input
observer  are  special  cases.  Filter  designs  can  be  obtained  for  both  time-invariant  and  time-varying  systems. 
Copyright  ©  2002  John  Wiley  &  Sons,  Ltd. 

key  words:  fault  detection  and  identification;  analytical  redundancy;  Beard-Jones  detection  filter; 
approximate  fault  detection  filter;  robust  fault  detection  filter;  time-varying  system 


1.  INTRODUCTION 

Any  system  under  automatic  control  demands  a  high  degree  of  reliability  in  order  to  operate 
properly.  This  requires  a  health  monitoring  system  capable  of  detecting  any  plant,  actuator 
and  sensor  faults  as  they  occur  and  identifying  the  faulty  components.  This  process  is  called 
fault  detection  and  identification.  The  most  common  approach  to  fault  detection  and 
identification  is  hardware  redundancy  which  is  the  direct  comparison  of  the  outputs  from 
identical  components.  It  requires  very  little  computation.  However,  hardware  redundancy  is 
expensive and limited by space and weight. An alternative is analytical redundancy which uses
the modelled dynamic relationship between system inputs and measured system outputs to form
a  residual  process.  Nominally,  the  residual  is  non-zero  only  when  a  fault  has  occurred  and  is 
zero  at  other  times.  Therefore,  no  redundant  components  are  needed.  However,  additional 
computation  is  required. 

A popular approach to analytical redundancy is the detection filter, which was first introduced
by Beard [1] and refined by Jones [2]. It is also known as the Beard-Jones detection (BJD) filter. A
geometric interpretation and a spectral analysis of the BJD filter are given in References [3,4],
respectively.  The  idea  of  the  BJD  filter  is  to  place  the  reachable  subspace  of  each  fault  into 
invariant  subspaces  which  do  not  overlap  each  other.  Then,  when  a  non-zero  residual  is 
detected,  a  fault  can  be  announced  and  identified  by  projecting  the  residual  onto  each  of  the 
invariant  subspaces.  In  this  way,  multiple  faults  can  be  monitored  in  one  filter.  A  design 
algorithm  [5]  improves  the  robustness  of  the  BJD  filter  by  imposing  the  geometric  structure  to 
isolate  the  faults  and  using  the  design  freedom  remaining  to  bound  the  process  and  sensor  noise 
transmission. 

In  Reference  [3],  a  more  general  form  of  the  detection  filter,  called  restricted  diagonal 
detection  (RDD)  filter,  is  given  of  which  the  BJD  filter  is  a  special  case.  Instead  of 
placing  each  fault  into  an  invariant  subspace  like  the  BJD  filter  does,  the  RDD  filter 
places  all  the  other  faults  associated  with  each  fault  that  needs  to  be  detected  into  the 
unobservable  subspace  of  a  projected  residual.  Therefore,  each  projected  residual  is 
only  sensitive  to  one  fault,  but  not  to  the  other  faults.  When  every  fault  is  detected, 
the  RDD  filter  is  equivalent  to  the  BJD  filter.  However,  some  faults  do  not  need  to  be 
detected, but only need to be blocked from the projected residuals. For example, certain
process noise and plant uncertainty may be modelled as faults. By relaxing the constraint
on detecting the faults that do not need to be detected, the RDD filter is more robust than the
BJD filter [6].

One  related  approach,  unknown  input  observer  [7-9],  is  another  special  case  of  the  RDD  filter 
when  only  one  fault  is  detected.  The  faults  are  divided  into  two  groups:  a  single  target  fault  and 
possibly  several  nuisance  faults.  The  nuisance  faults  are  placed  in  the  unobservable  subspace  of 
the  residual.  Therefore,  the  residual  is  only  sensitive  to  the  target  fault,  but  not  to  the  nuisance 
faults.  Although  only  one  fault  can  be  monitored  in  each  unknown  input  observer,  there  are 
some  benefits.  For  example,  one  gains  additional  flexibility  which  can  be  used  to  improve 
robustness and time-varying systems can be treated [10-12].

In  this  paper,  a  new  robust  multiple-fault  detection  and  identification  algorithm  is  derived 
from  solving  an  optimization  problem.  The  output  error  is  divided  into  several  subspaces  by 
using projectors. For each subspace, the projected output error variance due to one fault,
denoted the associated target fault, is maximized and the projected output error variance due to
other faults, denoted the associated nuisance fault, process noise, sensor noise and initial
condition error is minimized. The cost criterion is constructed such that each projected output
error  variance  is  included  as  a  sum  which  produces  approximately  the  geometric  structure  of  the 
RDD  filter.  Therefore,  each  projected  residual  of  the  robust  multiple-fault  detection  filter  is 
affected  primarily  by  one  fault  and  minimally  by  other  faults  and  is  robust  with  respect  to  the 
disturbances.  Note  that  [12],  an  approximate  unknown  input  observer,  is  a  special  case  of  the 
filter  when  only  one  fault  is  detected. 

In  the  limit  where  the  weighting  on  each  projected  output  error  variance  due  to  the  associated 
nuisance  fault  goes  to  infinity,  it  is  shown  that  the  filter  places  each  associated  nuisance  fault 
into the unobservable subspace of its associated projected residual when there is no
complementary subspace* for both time-invariant and time-varying systems. Therefore, the
filter  becomes  equivalent  to  the  RDD  filter  in  the  limit  and  extends  the  RDD  filter  to  the  time- 
varying  case.  Numerical  examples  show  that  the  filter  is  an  approximate  RDD  filter  when  it  is 
not  in  the  limit  even  if  there  exists  the  complementary  subspace.  These  limiting  results  are 
important  in  ensuring  that  both  fault  detection  and  identification  can  occur. 

The  robust  multiple-fault  detection  filter  is  fundamentally  different  from  other  design 
algorithms  for  the  RDD  or  BJD  filter  which  explicitly  force  the  geometric  structure  by  using 
eigenstructure  assignment  [4,6]  or  geometric  theory  [3,5].  Rather,  the  filter  is  derived  from 
solving  an  optimization  problem  and  only  in  the  limit,  is  the  geometric  structure  of  the  RDD 
filter  recovered  and  the  faults  are  completely  isolated.  When  it  is  not  in  the  limit,  the  filter  only 
isolates  the  faults  within  approximate  unobservable  subspaces.  This  new  feature  allows  the  filter 
to  be  potentially  more  robust  because  of  the  additional  design  freedom  which  allows  different 
degrees  of  fault  isolation.  Furthermore,  a  mechanism  that  enhances  the  sensitivity  of  the 
projected residuals to their associated target faults is provided. Finally, the filter can be applied
to time-varying systems. Although the filter has all these advantages, the process of deriving the
filter  gain  requires  the  solution  to  a  two-point  boundary  value  problem  which  includes  a  set  of 
Lyapunov  equations.  However,  the  filter  gain  computation  can  be  done  off-line  so  that  the  filter 
implementation  is  as  straightforward  as  the  RDD  filter. 

The  problem  is  formulated  in  Section  2  and  its  solution  is  derived  in  Section  3.  In  Section  4, 
the  filter  is  determined  in  the  limit  when  there  is  no  complementary  subspace.  In  Section  5,  the 
projectors  used  to  divide  the  output  error  are  derived  from  solving  the  optimization  problem.  In 
Section  6,  numerical  examples  are  given. 


2.  PROBLEM  FORMULATION 

Consider a linear, time-varying, uniformly observable system

$$\dot{x} = Ax + B_u u + B_w w \qquad (1a)$$

$$y = Cx + v \qquad (1b)$$

where $u$ is the control input, $y$ is the measurement, $w$ is the process noise and $v$ is the sensor
noise. Following the development in References [1,4,10], any plant, actuator and sensor faults
can be modelled as additive terms in the state equation (1a). Therefore, a linear system with $q$
faults can be modelled by

$$\dot{x} = Ax + B_u u + B_w w + \sum_{i=1}^{q} F_i \mu_i \qquad (2a)$$

$$y = Cx + v \qquad (2b)$$

The fault magnitudes $\mu_i$ are unknown and arbitrary functions of time that are zero when there is
no fault. The fault directions $F_i$ are maps that are a priori known. Assume the $F_i$ are monic so that
$\mu_i \neq 0$ implies $F_i \mu_i \neq 0$.


Footnote: The union of the invariant subspaces of the faults fills the entire state space, leaving no
remaining subspace (the complementary subspace).

If the first $s$ faults need to be detected, where $s \le q$, the objective of the robust multiple-fault
detection filter problem is to find a filter gain $L$ for the linear observer

$$\dot{\hat{x}} = A\hat{x} + B_u u + L(y - C\hat{x}) \qquad (3)$$

and projectors $\hat{H}_1 \cdots \hat{H}_s$ which operate on the residual

$$r = y - C\hat{x} \qquad (4)$$

such that each projected residual $\hat{H}_i r$ is affected primarily by its associated target fault $\mu_i$ and
minimally by its associated nuisance fault $\hat{\mu}_i = [\mu_1^T \cdots \mu_{i-1}^T \; \mu_{i+1}^T \cdots \mu_q^T]^T$, process noise $w$, sensor
noise $v$ and initial condition error $x(t_0) - \hat{x}(t_0)$. This approximates the RDD filter problem. Even
though the last $q - s$ faults are not detected, they are blocked from the $s$ projected residuals used
for detecting the first $s$ faults. By relaxing the constraint on detecting the last $q - s$ faults, the
robustness of the filter is improved [6]. When $s = q$, every fault is detected and this approximates
the BJD filter problem. When $s = 1$, only one fault is detected and this approximates the
unknown input observer problem.

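By using (2) and (3), the dynamic equation of the error, $e = x - \hat{x}$, is

$$\dot{e} = (A - LC)e + \sum_{i=1}^{q} F_i \mu_i + B_w w - Lv$$

Then, the error can be written as

$$e(t) = \Phi(t,t_0)e(t_0) + \int_{t_0}^{t} \Phi(t,\tau)\Big(\sum_{i=1}^{q} F_i \mu_i + B_w w - Lv\Big)\,d\tau \qquad (5)$$

subject to

$$\frac{d}{dt}\Phi(t,t_0) = (A - LC)\Phi(t,t_0), \quad \Phi(t_0,t_0) = I \qquad (6)$$

The residual (4) can be written as

$$r = Ce + v$$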
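
The error dynamics and the residual above can be illustrated with a scalar, noise-free simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the values of `a`, `c`, `L` and `f`, the unit step fault at `t_fault`, and the forward-Euler integration are all hypothetical choices made for illustration.

```python
def simulate_residual(a=-1.0, c=1.0, L=2.0, f=1.0, t_fault=1.0,
                      t_end=6.0, dt=1e-3):
    """Integrate e' = (a - L*c) e + f*mu(t) with forward Euler and
    return (t, r) samples, where r = c*e (no process or sensor noise)."""
    e = 0.0                                  # zero initial condition error (assumed)
    samples = []
    steps = int(round(t_end / dt))
    for k in range(steps + 1):
        t = k * dt
        samples.append((t, c * e))
        mu = 1.0 if t >= t_fault else 0.0    # hypothetical unit step fault
        e += dt * ((a - L * c) * e + f * mu)
    return samples


samples = simulate_residual()
r_before = max(abs(r) for t, r in samples if t < 1.0)   # residual before the fault
r_final = samples[-1][1]                                # residual long after the fault
```

With these assumed numbers the residual stays at zero until the fault occurs and then settles near `f*c/(L*c - a)`, which is the behaviour a detection filter exploits.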


To formulate the robust multiple-fault detection filter problem, it is assumed that $\mu_1 \cdots \mu_q$, $w$
and $v$ are zero mean, white Gaussian noise with power spectral density of $Q_1 \cdots Q_q$, $Q_w$ and $V$,
respectively, and the initial state $x(t_0)$ is a random vector with variance of $P_0$. It is also assumed
that $\mu_1 \cdots \mu_q$, $w$ and $v$ are uncorrelated with each other and with $x(t_0)$. Now a cost criterion is
needed for deriving $L$ and $\hat{H}_1 \cdots \hat{H}_s$. If the cost criterion is associated with the projected residual
$\hat{H}_i(Ce + v)$, it is unusable from the statistical viewpoint since the variance of the projected
residual generates a $\delta$-function due to the sensor noise. Therefore, the cost criterion will be
associated with the projected output error $\hat{H}_i Ce$. In order to determine the cost criterion, define

$$h_i(t) = \hat{H}_i C \int_{t_0}^{t} \Phi(t,\tau) F_i \mu_i \, d\tau$$

$$\hat{h}_i(t) = \hat{H}_i C \int_{t_0}^{t} \Phi(t,\tau) \hat{F}_i \hat{\mu}_i \, d\tau \qquad (7)$$

$$\bar{h}_i(t) = \hat{H}_i C \Big[\Phi(t,t_0)e(t_0) + \int_{t_0}^{t} \Phi(t,\tau)(B_w w - Lv)\, d\tau\Big]$$

where $\hat{F}_i = [F_1 \cdots F_{i-1} \; F_{i+1} \cdots F_q]$. From (5), $E[\hat{h}_i(t)\hat{h}_i(t)^T]$ represents the transmission from
$\hat{\mu}_i$ to $\hat{H}_i Ce$, $E[h_i(t)h_i(t)^T]$ represents the transmission from $\mu_i$ to $\hat{H}_i Ce$ and $E[\bar{h}_i(t)\bar{h}_i(t)^T]$ represents the
transmission from $w$, $v$ and $e(t_0)$ to $\hat{H}_i Ce$, where $E[\cdot]$ is the expectation operator. Note that the
power spectral density of $\hat{\mu}_i$ is $\hat{Q}_i = \mathrm{diag}(Q_1 \cdots Q_{i-1}, Q_{i+1} \cdots Q_q)$ and $e(t_0)$ is a zero mean random
vector with variance of $P_0$ if $\hat{x}(t_0) = E[x(t_0)]$.

Therefore, the robust multiple-fault detection filter problem is to find the filter gain $L$ and
projectors $\hat{H}_1 \cdots \hat{H}_s$ which minimize the cost criterion

$$J = \frac{1}{t_1 - t_0}\int_{t_0}^{t_1} \mathrm{tr}\Big\{\sum_{i=1}^{s}\Big(\frac{1}{\gamma_i}E[\hat{h}_i(t)\hat{h}_i(t)^T] + E[\bar{h}_i(t)\bar{h}_i(t)^T] - E[h_i(t)h_i(t)^T]\Big)\Big\}\,dt \qquad (8)$$

where $t_1$ is the final time and $\gamma_1 \cdots \gamma_s$ are positive scalars. Making $\gamma_1 \cdots \gamma_s$ small places large
weightings on reducing the associated nuisance fault transmissions. The summation is used to
sum the $s$ projected output error variances for detecting the $s$ faults. The trace operator forms a
scalar cost criterion of the matrix output error variance. Note that the power spectral densities
$Q_1 \cdots Q_q$ are considered as design parameters. Since no assumption is made on the fault
magnitudes, their white noise representation is a convenience. For each projected output error,
$Q_i$ and $(1/\gamma_i)\hat{Q}_i$ represent the weightings on the associated target and nuisance fault
transmissions, respectively. When $Q_i$ is larger, the transmission from $\mu_i$ is larger. This provides
a mechanism to enhance the sensitivity of the projected residuals to their associated target faults.
When $(1/\gamma_i)\hat{Q}_i$ is larger, the transmission from $\hat{\mu}_i$ is smaller. However, the power spectral
densities $Q_w$ and $V$, and the variance $P_0$ can have physical values. When $Q_w$, $V$ and $P_0$ are larger,
the transmission from the process noise, sensor noise and initial condition error is smaller,
respectively.

Since the effect of the process and sensor noises on the residual is explicitly minimized, the
filter is robust with respect to these disturbances. Certain types of model uncertainties can also
be modelled as additive noises [9,13]. Therefore, the filter can be made robust to these model
uncertainties. In Section 4, it is shown that the filter recovers the geometric structure of the
RDD filter in the limit as $\gamma_i \to 0$, $i = 1 \cdots s$, and the faults are completely isolated. When it is not
in the limit, the filter is an approximate RDD filter and only isolates the faults within
approximate unobservable subspaces. This new feature allows the filter to be potentially more
robust because of the additional design freedom which allows different degrees of fault isolation.

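In Section 3, the robust multiple-fault detection filter problem is first solved with $\hat{H}_1 \cdots \hat{H}_s$
defined a priori as the projectors used by the RDD filter [3], i.e.

$$\hat{H}_i : \mathcal{Y} \to \mathcal{Y}, \quad \mathrm{Ker}\,\hat{H}_i = C\hat{\mathcal{T}}_i, \quad \hat{H}_i = I - C\hat{\mathcal{T}}_i[(C\hat{\mathcal{T}}_i)^T C\hat{\mathcal{T}}_i]^{-1}(C\hat{\mathcal{T}}_i)^T \qquad (9)$$

where $C\hat{\mathcal{T}}_i = [C\mathcal{T}_1 \cdots C\mathcal{T}_{i-1} \; C\mathcal{T}_{i+1} \cdots C\mathcal{T}_q]$. For time-invariant systems,
$C\mathcal{T}_i = [CA^{\delta_{i,1}} f_{i,1} \;\; CA^{\delta_{i,2}} f_{i,2} \cdots CA^{\delta_{i,p_i}} f_{i,p_i}]$ where $f_{i,j}$ is the $j$th column of $F_i$, $\delta_{i,j}$ is the smallest
non-negative integer such that $CA^{\delta_{i,j}} f_{i,j} \neq 0$ and $p_i = \dim F_i$. For time-varying systems, the projector (9)
is generalized with $C\mathcal{T}_i = [Cb_{i,1,\delta_{i,1}} \;\; Cb_{i,2,\delta_{i,2}} \cdots Cb_{i,p_i,\delta_{i,p_i}}]$ [10]. The vectors $b_{i,j,\delta_{i,j}}$, $j = 1 \cdots p_i$, are
found from the iteration defined by the Goh transformation [14,15],

$$b_{i,j,k} = A(t)\,b_{i,j,k-1} - \dot{b}_{i,j,k-1} \quad \text{with} \quad b_{i,j,0} = f_{i,j} \qquad (10)$$

$\delta_{i,j}$ is the smallest non-negative integer such that $Cb_{i,j,\delta_{i,j}} \neq 0$ for $t \in [t_0, t_1]$. More details about
the projector (9) and the subspaces it is built from can be found in Section 4.1. In Section 4.2, it is shown
that (9) minimizes the cost criterion in the limit. Therefore, (9) is the optimal projector in the limit. In
Section 5, the robust multiple-fault detection filter problem is solved with $\hat{H}_1 \cdots \hat{H}_s$ derived from solving
the minimization problem.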
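
For a time-invariant system with a single nuisance direction, the construction behind the projector (9) can be sketched in a few lines. This is an illustrative sketch only: the matrices `C`, `A` and the fault direction `f` are hypothetical, the function names are not from the paper, and the single-column case is assumed so that the matrix inverse in (9) reduces to a scalar division.

```python
def matvec(M, v):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def first_nonzero_output_direction(C, A, f, tol=1e-12):
    """Return (delta, C A^delta f) for the smallest delta with C A^delta f != 0."""
    v = list(f)
    for delta in range(len(A)):
        out = matvec(C, v)
        if any(abs(o) > tol for o in out):
            return delta, out
        v = matvec(A, v)
    raise ValueError("fault direction never appears at the output")


def projector(m):
    """H = I - m (m^T m)^{-1} m^T for a single output direction m."""
    mm = sum(x * x for x in m)
    n = len(m)
    return [[(1.0 if i == j else 0.0) - m[i] * m[j] / mm for j in range(n)]
            for i in range(n)]


# Hypothetical three-state, two-output example
A = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
C = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
f = [0.0, 0.0, 1.0]                      # nuisance fault direction (assumed)

delta, m = first_nonzero_output_direction(C, A, f)
H = projector(m)
blocked = matvec(H, m)                   # the nuisance output direction is annihilated
```

In this example the fault direction is invisible at the output until it has been propagated once through `A`, so `delta` is 1, and the resulting projector removes the second output channel.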


3.  SOLUTION 


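In this section, the minimization problem given by (8) is solved. By using (7), the cost criterion
can be rewritten as

$$J = \frac{1}{t_1 - t_0}\int_{t_0}^{t_1} \mathrm{tr}\Big\{\sum_{i=1}^{s}\Big[\hat{H}_i C \int_{t_0}^{t}\Phi(t,\tau)\Big(\frac{1}{\gamma_i}\hat{F}_i\hat{Q}_i\hat{F}_i^T - F_iQ_iF_i^T + B_wQ_wB_w^T + LVL^T\Big)\Phi(t,\tau)^T d\tau\, C^T\hat{H}_i + \hat{H}_iC\Phi(t,t_0)P_0\Phi(t,t_0)^TC^T\hat{H}_i\Big]\Big\}\,dt$$

which is to be minimized with respect to $L$ subject to (6). By adding the zero term

$$\mathrm{tr}\Big\{\sum_{i=1}^{s}\hat{H}_iC\Big[\Phi(t,t)P_i(t)\Phi(t,t)^T - \Phi(t,t_0)P_i(t_0)\Phi(t,t_0)^T - \int_{t_0}^{t}\frac{d}{d\tau}\big[\Phi(t,\tau)P_i(\tau)\Phi(t,\tau)^T\big]d\tau\Big]C^T\hat{H}_i\Big\}$$

to the integrand of $J$ and using (6), the minimization problem can be rewritten, up to the positive
constant factor $1/(t_1 - t_0)$, as

$$\min_L \int_{t_0}^{t_1}\mathrm{tr}\Big\{\sum_{i=1}^{s}\hat{H}_iC\Big[\int_{t_0}^{t}\Phi(t,\tau)(L - P_iC^TV^{-1})V(L - P_iC^TV^{-1})^T\Phi(t,\tau)^T d\tau\Big]C^T\hat{H}_i\Big\}\,dt = \min_L \int_{t_0}^{t_1}\mathrm{tr}\Big\{\sum_{i=1}^{s}\hat{H}_iCW_iC^T\hat{H}_i\Big\}\,dt \qquad (11)$$

subject to

$$\dot{W}_i = (A - LC)W_i + W_i(A - LC)^T + (L - P_iC^TV^{-1})V(L - P_iC^TV^{-1})^T, \quad W_i(t_0) = 0 \qquad (12)$$

for $i = 1 \cdots s$, where $W_i \ge 0$ denotes the inner integral in (11) and

$$\dot{P}_i = AP_i + P_iA^T - P_iC^TV^{-1}CP_i + \frac{1}{\gamma_i}\hat{F}_i\hat{Q}_i\hat{F}_i^T - F_iQ_iF_i^T + B_wQ_wB_w^T, \quad P_i(t_0) = P_0 \qquad (13)$$

The term $\frac{1}{t_1 - t_0}\int_{t_0}^{t_1}\mathrm{tr}\{\sum_{i=1}^{s}\hat{H}_iCP_iC^T\hat{H}_i\}\,dt$ is dropped in (11) because it is fixed with respect
to $L$. However, it will be brought back in Section 5 when the cost criterion is also minimized with
respect to $\hat{H}_1 \cdots \hat{H}_s$. Note that (13) is solved independently of $L$ and $\hat{H}_1 \cdots \hat{H}_s$.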

The variational Hamiltonian of the minimization problem is defined as

$$\mathcal{H} = \sum_{i=1}^{s}\Big\{\mathrm{tr}(\hat{H}_iCW_iC^T\hat{H}_i) + \mathrm{tr}\big\{K_i\big[(A - LC)W_i + W_i(A - LC)^T + (L - P_iC^TV^{-1})V(L - P_iC^TV^{-1})^T\big]\big\}\Big\}$$

where $K_i$ is a continuously differentiable matrix Lagrange multiplier. The first-order necessary
conditions [16] imply that the optimal solution for $L$ and the dynamics of $K_i$ are

$$\frac{\partial\mathcal{H}}{\partial L} = \sum_{i=1}^{s}\big[-2CW_iK_i + 2V(L^* - P_iC^TV^{-1})^TK_i\big] = 0$$

$$\Rightarrow \quad L^* = \Big(\sum_{i=1}^{s}K_i\Big)^{-1}\sum_{i=1}^{s}K_i(P_i + W_i)\,C^TV^{-1} \qquad (14)$$

and

$$-\dot{K}_i = \frac{\partial\mathcal{H}}{\partial W_i} = K_i(A - LC) + (A - LC)^TK_i + C^T\hat{H}_iC, \quad K_i(t_1) = 0 \qquad (15)$$

where $i = 1 \cdots s$. Therefore, the determination of the filter gain requires the solution to a
two-point boundary value problem which includes a set of Lyapunov equations (12) and (15),
coupled by (14). An alternative approach is to solve (11) numerically by using the gradient
method. However, the global minimum cannot be guaranteed because (11) may not be convex.
Note that the filter gain computation can be done off-line so that the filter implementation is as
straightforward as the RDD filter. Numerical examples are given in Section 6.

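For the infinite-time case, the minimization problem (11) becomes

$$\lim_{t_1 \to \infty}\min_L J = \min_L \mathrm{tr}\Big\{\sum_{i=1}^{s}\hat{H}_iCW_iC^T\hat{H}_i\Big\} \qquad (16)$$

subject to

$$0 = (A - LC)W_i + W_i(A - LC)^T + (L - P_iC^TV^{-1})V(L - P_iC^TV^{-1})^T \qquad (17)$$

for $i = 1 \cdots s$ where $W_i \ge 0$ and

$$0 = AP_i + P_iA^T - P_iC^TV^{-1}CP_i + \frac{1}{\gamma_i}\hat{F}_i\hat{Q}_i\hat{F}_i^T - F_iQ_iF_i^T + B_wQ_wB_w^T$$

The optimal solution for $L$ can be derived similarly:

$$L^* = \Big(\sum_{i=1}^{s}K_i\Big)^{-1}\sum_{i=1}^{s}K_i(P_i + W_i)\,C^TV^{-1} \qquad (18)$$

satisfying (17) and

$$0 = K_i(A - LC) + (A - LC)^TK_i + C^T\hat{H}_iC \qquad (19)$$

For the special case where $s = 1$ and $\mu_1$ is detected, the minimization problem (11) becomes

$$\min_L \frac{1}{t_1 - t_0}\int_{t_0}^{t_1}\mathrm{tr}\Big[\hat{H}_1C\int_{t_0}^{t}\Phi(t,\tau)(L - P_1C^TV^{-1})V(L - P_1C^TV^{-1})^T\Phi(t,\tau)^T d\tau\, C^T\hat{H}_1\Big]dt$$

The optimal solution for $L$ is

$$L^* = P_1C^TV^{-1} \qquad (20)$$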
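
The single-fault gain (20) only needs the solution of the Riccati equation (13), which can be integrated off-line. The following scalar sketch shows the idea under stated assumptions: all numerical values are hypothetical, forward Euler is used in place of a proper ODE solver, and a scalar system cannot actually isolate faults, so the example only illustrates the arithmetic of (13) and (20).

```python
import math


def riccati_rhs(P, a, c, V, q_eff):
    """Scalar form of (13): P' = 2 a P - (c^2 / V) P^2 + q_eff."""
    return 2.0 * a * P - (c * c / V) * P * P + q_eff


def steady_state_P(a, c, V, q_eff, P0=0.0, dt=1e-3, t_end=20.0):
    """Integrate the scalar Riccati equation forward until it settles."""
    P = P0
    for _ in range(int(round(t_end / dt))):
        P += dt * riccati_rhs(P, a, c, V, q_eff)
    return P


# Hypothetical scalar data (not taken from the paper)
a, c, V = -1.0, 1.0, 1.0
gamma = 0.5
f_nuis, Q_nuis = 1.0, 1.0        # nuisance fault direction and weighting
f_tgt, Q_tgt = 1.0, 0.5          # target fault direction and weighting
b_w, Q_w = 1.0, 1.5              # process noise input and power spectral density

# Forcing term of (13): (1/gamma) F^ Q^ F^T - F Q F^T + B_w Q_w B_w^T
q_eff = f_nuis ** 2 * Q_nuis / gamma - f_tgt ** 2 * Q_tgt + b_w ** 2 * Q_w

P_num = steady_state_P(a, c, V, q_eff)
P_closed = V * (a + math.sqrt(a * a + c * c * q_eff / V)) / (c * c)
L_star = P_num * c / V           # scalar form of L* = P C^T V^{-1}, Eq. (20)
```

Because the target-fault term enters (13) with a negative sign, the forcing term can become negative for large target weightings; the closed-form root used for comparison assumes the quantity under the square root stays non-negative.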


Copyright  ©  2002  John  Wiley  &  Sons,  Ltd. 


Int. J. Robust Nonlinear Control 2002; 12:675-696


Appendix  I 


“A  Decentralized  Fault  Detection  Filter,” 

Walter  H.  Chung,  Jason  L.  Speyer  and  Robert  H.  Chen, 
ASME J. of Dynamic Systems, Measurement, Control, Vol. 123, 2001.


Walter  H.  Chung 
Jason  L.  Speyer 
Robert  H.  Chen 


A  Decentralized  Fault  Detection 
Filter1 


Mechanical  and  Aerospace 
Engineering  Department, 
University  of  California,  Los  Angeles, 
48-121 Engr. IV,
Los Angeles, CA 90095
e-mail: speyer@seas.ucla.edu or
walter.h.chung@aero.org or
chrobert@talus.seas.ucla.edu


In this paper, we introduce the decentralized fault detection filter, a structure that results
from merging decentralized estimation theory with the game theoretic fault detection
filter. A decentralized approach may be the ideal way to health monitor large-scale
systems, since it decomposes the problem down into (potentially smaller) "local" problems.
These local results are then blended into a "global" result that describes the health
of the entire system. The benefits of such an approach include added fault tolerance and
easy scalability. An example given at the end of the paper demonstrates the use of this
filter for a platoon of cars proposed for advanced vehicle control systems.

[DOI: 10.1115/1.1367859]


1  Introduction 

Observers are, in many ways, an ideal tool for fault detection
and identification (FDI). Failures act as unexpected inputs into a
system and, thus, drive the error residual of any observer to nonzero
values. With careful selection of the observer gain, these
fault-driven residuals can be made to have persistent and distinctive
characteristics. In many cases, freedom exists to address other
design issues, such as noise sensitivity and parameter robustness.
For these reasons, the application of observers to the problem of
fault detection and identification has long been an active area of
research.

There are two types of observers used for fault detection and
identification. The first is known as the Beard-Jones Fault Detection
Filter [1,2]. This filter has a unique subspace structure in
which the reachable subspaces of the modeled faults are restricted
to lie within nonoverlapping invariant subspaces that can be made
unobservable to a projection on the filter residual. Because of this,
simultaneous detection and identification can be achieved. The
failure is detected when the projection is nonzero. The failure is
identified by the subspace corresponding to the nonzero projection.

The second type of FDI observer is known as the unknown
input observer. In this observer, the set of modeled faults is divided
into two groups: the faults to be detected and the faults that
are to be ignored. The former is made distinguishable from the
latter by constructing an output through which the latter set is
unobservable. Detection is then achieved when this output is nonzero
and identification is trivial because we are only trying to
detect one set of faults in the possible presence of the other. The
unknown input observer is clearly less capable than the Beard-Jones
filter, but its relatively simple structure allows for easy approximation
by optimization methods [3,4].

As both of these approaches have become more refined, applications
have begun to be seen in the literature [5,6]. With the
advent of applications, however, new issues related to implementation
have come to the forefront. In this paper, we will look at
some of the challenges inherent to detecting faults in large-scale
systems. For such systems, a decentralized fault detection filter
may be the logical approach to the problem.

The decentralized fault detection filter is the result of combining
the game theoretic fault detection filter developed by Chung
and Speyer [4] with the decentralized filtering algorithm introduced
by Speyer [7] and extended by Willsky et al. [8]. It approximates
the actions of an unknown input observer and is
formed by combining the estimates of several “local” estimators
(each driven by independent measurement sets). For large-scale
systems, it simplifies the health monitoring problem by decomposing
it down into a collection of smaller problems. For some systems
like a platoon of cars or a formation of airplanes, its decentralized
structure reflects the actual physical structure of the
system. A decentralized fault detection filter also introduces scalability
for circumstances such as when a car joins the platoon or
when an airplane drops out of formation for repairs. It also has
built-in fault tolerance in that sensors can be checked and validated
prior to their measurements being blended into the global
estimate [9].

The remainder of the paper is organized as follows. In Section
2, the decentralized estimator is described. An essential insight
revealed there is that observers that take their gains from Riccati
solutions are much more suited for decentralized estimation than
general Luenberger Observers that do not. This leads us to a decentralized
fault detection filter based upon approximate unknown
input observers. We describe these observers in Section 3. An
overview of the decentralized fault detection filter is then given in
Section 4. An essential part of this filter is how one obtains the
global/local decomposition needed to develop the network. We
suggest a technique based upon minimal realizations and demonstrate
this in Section 5 in an example problem based around a two-car
platoon.

2 Decentralized Estimation Theory and its Application to FDI

2.1 The General Solution. In this section, we will review
the basic results of decentralized estimation theory. A detailed
examination of this theory is given in [10]. We begin with a linear
system driven by process disturbances, $w$, and sensor noise, $v$:

$$\dot{x} = Ax + Bw, \quad x(0) = x_0, \quad x \in \mathcal{R}^n, \qquad (1)$$

$$y = Cx + v, \quad y \in \mathcal{R}^m. \qquad (2)$$

It is desired to derive an estimate of $x$. The standard approach is a
full-order observer,

$$\dot{\hat{x}} = A\hat{x} + L(y - C\hat{x}), \quad \hat{x}(0) = 0, \qquad (3)$$

which we will refer to as a centralized estimator. An alternative to
this method is to derive the estimate with a decentralized estimator
in which $\hat{x}$ is found by combining estimates based upon “local”
models,

$$\dot{x}^j = A^jx^j + B^jw^j, \quad x^j \in \mathcal{R}^{n_j}, \quad (j = 1 \ldots N), \qquad (4)$$

$$y^j = E^jx^j + v^j, \quad y^j \in \mathcal{R}^{m_j}, \quad (j = 1 \ldots N). \qquad (5)$$

Together these local models provide an alternate representation of
the original system, which is referred to as the “global” system
for purposes of clarification. The vector, $x$, is likewise called the
global state. The number of local systems, $N$, is bounded above by
the number of measurements in the system, i.e., $N \le m$.

The global/local decomposition is really of only secondary importance.
As Chung [10] argues, there are no real restrictions on
how one forms the global and local models. The real key to the
decentralized estimation algorithm is the relationship between the
global set of measurements, $y$, and the $N$ local sets, $y^j$. The two
basic assumptions are that the local sets are simply segments of
the global set,

$$y = \begin{bmatrix} y^1 \\ \vdots \\ y^N \end{bmatrix}, \qquad (6)$$

and that the local sets can be described in terms of both the local
state and the global state. In other words, $y^j$ can be given by (5) or
by

$$y^j = C^jx + v^j, \quad (j = 1 \ldots N). \qquad (7)$$

Equations (2), (6), and (7) imply that

$$C = \begin{bmatrix} C^1 \\ \vdots \\ C^N \end{bmatrix}$$

and that

$$v = \begin{bmatrix} v^1 \\ \vdots \\ v^N \end{bmatrix}. \qquad (8)$$

The decentralized estimation algorithm falls out when we attempt
to estimate the global state by first generating estimates of
the local systems (4) using the local measurement sets, $y^j$, and the
local models, $A^j$:

$$\dot{\hat{x}}^j = A^j\hat{x}^j + L^j(y^j - E^j\hat{x}^j), \quad \hat{x}^j(t_0) = 0, \quad (j = 1 \ldots N). \qquad (9)$$

The objective then is to obtain the global state estimate, $\hat{x}$,
through some simple function of the local estimates. As it turns
out, in the most general case, the global estimate is an affine
combination,

$$\hat{x} = \sum_{j=1}^{N}(G^j\hat{x}^j + h^j), \qquad (10)$$

where $h^j$ is a measurement-dependent variable propagated by

$$\dot{h}^j = \Phi h^j + (\Phi G^j - \dot{G}^j - G^j\Phi^j)\hat{x}^j, \quad h^j(0) = 0. \qquad (11)$$

The constituent matrices are defined as

$$\Phi := A - \sum_{j=1}^{N}G^jL^jC^j, \qquad \Phi^j := A^j - L^jE^j.$$

The matrices $G^j$ are “blending matrices.” They are so-called because
they act to blend the local estimates together to form the
global estimate. They can also be shown [10] to directly connect
the local and global gains via

$$L = [G^1 \; \cdots \; G^N]\begin{bmatrix} L^1 & 0 & \cdots & 0 \\ 0 & L^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & L^N \end{bmatrix}. \qquad (12)$$

The interested reader can derive Eqs. (11) and (12) by differentiating
(10) and substituting in the equations for the local estimators
where appropriate. The derivation is completed through some algebraic
manipulation and integration by parts (see [10] or [11]).

Equation (12) looks harmless, but it turns out to be the key
relationship in decentralized estimation. In fact, it is the necessary
and sufficient condition for decentralized estimation [10,11]. Another
interesting fact is that (12) does not have a solution in the
general case for the blending matrices, $G^j$, because of an insufficient
number of equations for all of the unknowns. There is, however,
one general class of estimator for which (12) is satisfied
almost automatically. This class is comprised of estimators that
take their gains from Riccati solutions, i.e., Kalman filters [7,8] or
$H_\infty$ filters [12]. In this case, the local gains are found from

$$L^j = P^j(E^j)^T(V^j)^{-1}, \qquad (13)$$

where, in the case of the Kalman filter, the matrix $P^j$ is the solution
of the Riccati equation:

$$\dot{P}^j = A^jP^j + P^j(A^j)^T + B^jW^j(B^j)^T - P^j(E^j)^T(V^j)^{-1}E^jP^j, \quad P^j(0) = P^j_0.$$

The matrices, $V^j$ and $W^j$, are weightings that are related to the
local disturbances, $v^j$ and $w^j$, that drive the local systems (4), (5).
For the Kalman filter, it is assumed that $v^j$ and $w^j$ are white,

$$E[w^j(t)w^j(\tau)^T] = W^j\delta(t - \tau),$$

$$E[v^j(t)v^j(\tau)^T] = V^j\delta(t - \tau),$$

and Gaussian. The initial condition, $P^j_0$, is chosen by the analyst
based upon his knowledge of the system. In the global system, the
gain is

$$L = PC^TV^{-1},$$

where

$$V = \begin{bmatrix} V^1 & 0 & \cdots & 0 \\ 0 & V^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & \cdots & V^N \end{bmatrix}.$$

The matrices, $V^j$, on the block diagonal of $V$ are the local measurement
noise weightings. In our example, however, we will
show that there is some design flexibility in choosing the global
weight. Specifically, one can choose scalar gains on the local
weightings,

$$V = \begin{bmatrix} \alpha_1V^1 & 0 & \cdots & 0 \\ 0 & \alpha_2V^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & \cdots & \alpha_NV^N \end{bmatrix}. \qquad (15)$$

This added flexibility allows us to meet other design criteria that
might arise in the problem. In our example in Section 5, we demonstrate
how to use this design freedom to improve the response
of our decentralized fault detection filter to the faults that we want
to see.

The matrix, $P$, is the solution to the global Riccati equation,

$$\dot{P} = AP + PA^T + BWB^T - PC^TV^{-1}CP, \quad P(0) = P_0.$$

The blending matrix solution is then

$$G^j = P(S^j)^T(\alpha_jP^j)^{-1}, \quad j = 1, \ldots, N, \qquad (16)$$

where $S^j$ is any matrix such that

$$C^j = E^jS^j. \qquad (17)$$

One can, in fact, always take $S^j = (E^j)^+C^j$ where $(E^j)^+$ is the
pseudo-inverse of $E^j$ [8]. Note that the solutions for $G^j$ will always
exist for Riccati-based observers so long as $P^j$ is invertible
or, equivalently, positive-definite. This will always be the case if
the triples, $(C^j, A^j, B^j)$, are controllable and observable for each of
the local systems.
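
The role of the blending matrices can be checked numerically on a scalar example. The sketch below is illustrative only and rests on several assumptions that are not from the paper: a scalar global state, two sensors, local models identical to the global model (so that $E^j = C^j$ and $S^j = 1$), steady-state Riccati solutions (so that $\dot{G}^j = 0$), $\alpha_j = 1$, hypothetical numerical values, and forward-Euler integration. It propagates the centralized estimator (3), the local estimators (9) and the variables $h^j$ of (11), and compares the blended estimate (10) with the centralized one.

```python
import math


def steady_riccati(a, q, s):
    """Positive root of the scalar algebraic Riccati equation 0 = 2 a P + q - s P^2."""
    return (a + math.sqrt(a * a + q * s)) / s


# Hypothetical scalar global system x' = a x + b w with sensors y^j = c_j x + v^j
a, b, W = -0.5, 1.0, 2.0
c = [1.0, 2.0]      # C^1, C^2 (rows of the global C)
V = [0.5, 1.0]      # local weightings V^1, V^2, with alpha_j = 1

# Assumption: each local model equals the global one, so E^j = C^j and S^j = 1
P_loc = [steady_riccati(a, b * b * W, cj * cj / Vj) for cj, Vj in zip(c, V)]
L_loc = [Pj * cj / Vj for Pj, cj, Vj in zip(P_loc, c, V)]         # Eq. (13)
P = steady_riccati(a, b * b * W, sum(cj * cj / Vj for cj, Vj in zip(c, V)))
L = [P * cj / Vj for cj, Vj in zip(c, V)]                         # L = P C^T V^{-1}
G = [P / Pj for Pj in P_loc]                                      # Eq. (16)

Phi = a - sum(Gj * Lj * cj for Gj, Lj, cj in zip(G, L_loc, c))
Phi_loc = [a - Lj * cj for Lj, cj in zip(L_loc, c)]


def simulate(steps=2000, dt=1e-3):
    """Euler-integrate (3), (9) and (11); return (centralized, blended) estimates."""
    x_c, x_loc, h = 0.0, [0.0, 0.0], [0.0, 0.0]
    for k in range(steps):
        t = k * dt
        y = [math.sin(t), math.cos(t)]       # arbitrary measurement records
        innov = sum(Lj * (yj - cj * x_c) for Lj, yj, cj in zip(L, y, c))
        x_c_next = x_c + dt * (a * x_c + innov)
        # h^j uses the local estimates from the current step, before they are updated
        h = [hj + dt * (Phi * hj + (Phi * Gj - Gj * Pj) * xj)
             for hj, Gj, Pj, xj in zip(h, G, Phi_loc, x_loc)]
        x_loc = [xj + dt * (a * xj + Lj * (yj - cj * xj))
                 for xj, Lj, yj, cj in zip(x_loc, L_loc, y, c)]
        x_c = x_c_next
    blended = sum(Gj * xj + hj for Gj, xj, hj in zip(G, x_loc, h))   # Eq. (10)
    return x_c, blended


x_central, x_blended = simulate()
```

In this setting the products $G^jL^j$ reproduce the columns of the global gain, which is relation (12), and the blended estimate tracks the centralized one to within rounding error.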

2.2 Implications for Detection Filters. The analysis of the
previous section implies that we will be able to form a decentralized
fault detection filter in the general case only if we are able to
find a Riccati-based observer that is equivalent to a Beard-Jones
filter or unknown input observer. The most direct way to achieve
this is to find a linear-quadratic optimization problem that is
equivalent to the fault detection and identification problem. This is
an analog of the famous inverse optimal control problem first
posed by Kalman [13]. In [4], however, it is shown that the Beard-Jones
filter gains do not correspond with those derived from
linear-quadratic problems. An indirect way to get a Riccati-based
observer is to pose a linear-quadratic optimization problem that
closely mimics the fault detection problem. Such a problem was
posed and solved in [4], and we will review the solution found
there in the next section.

3  The  Approximate  Fault  Detection  and  Identification 
Problem 

3.1 Problem Formulation. Consider the system given by
(1), (2) with the further assumption that the state matrices have
sufficient smoothness to guarantee the existence of derivatives of
various order. Beard [14] showed that failures in the sensors and
actuators, and unexpected changes in the plant dynamics can be
modeled as additive signals,

$$\dot{x} = Ax + Bw + F_1\mu_1 + \cdots + F_q\mu_q. \qquad (18)$$

Let $n$ be the dimension of the state-space. The $n \times p_i$ matrix, $F_i$,
$i = 1 \cdots q$, is called a failure map and represents the directional
characteristics of the $i$th fault. The $p_i \times 1$ vector, $\mu_i$, is the failure
signal and represents the time dependence of the failure. It will
always be assumed that each $F_i$ is monic, i.e., $F_i\mu_i \neq 0$ for $\mu_i \neq 0$.
See [15,4] for further details on how to model failures.
Throughout this paper, we will refer to $\mu_1$ as the “target fault”
and the other faults, $\mu_j$, $j = 2 \cdots q$, as the “nuisance faults.”
Without loss of generality, we can represent the entire set of nuisance
faults (and, if desired, the disturbance, $w$) with a single map,
$F_2$, and vector, $\mu_2$:

$$\dot{x} = Ax + F_1\mu_1 + F_2\mu_2.$$

Suppose that it is desired to detect the occurrence of the failure,
$\mu_1$, in spite of the measurement noise, $v$, and the possible presence
of the nuisance faults, $\mu_2$. The Beard-Jones filter solves this
problem by finding a gain, $L$, so that a standard Luenberger Observer,

$$\dot{\hat{x}} = A\hat{x} + L(y - C\hat{x}), \qquad (19)$$

will have an invariant subspace structure that restricts the influence
of $\mu_1$ and $\mu_2$ to separate and nonintersecting invariant subspaces.
With a properly chosen projector, $H$, we can then project
the filter residual, $(y - C\hat{x})$, onto the orthogonal complement of
the invariant subspace containing $\mu_2$ and get a signal,

$$z = H(y - C\hat{x}), \qquad (20)$$

such that

$$z = 0 \text{ when } \mu_1 = 0 \; (\mu_2 \text{ is arbitrary}). \qquad (21)$$

To be useful for FDI, $z$ must also be such that

$$z \neq 0 \text{ when } \mu_1 \neq 0. \qquad (22)$$

If we restrict ourselves to time-invariant systems, (22) will be
equivalent to requiring that the transfer function matrix between
$\mu_1(t)$ and $z(t)$ be left-invertible. Left-invertibility, however, is
a severe restriction, and it has no analog for the general time-varying
systems that we want to consider here. Previous researchers
[15,16] have, in fact, only required that the mapping from
$\mu_1(t)$ to $z(t)$ be input observable, i.e., $z \neq 0$ for any $\mu_1$ that is a
step input. It can be argued [16] that $z$ will be nonzero for “almost
any” $\mu_1$, since $\mu_1$ is unlikely to remain in the kernel of the
mapping to $z$ for all time.

We formulate the approximate detection filter design problem
by requiring input observability and relaxing the requirement (21).
Instead of (21), we require only that the transmission of the nuisance
fault be bounded above by a preset level, $\gamma > 0$; this bound is Eq. (23).

Equation (23) is identical to the disturbance attenuation problem
from robust control theory. We refer to the solution to the approximate
detection filter problem as the game theoretic fault detection
filter.

We complete our formulation of the disturbance attenuation
problem for fault detection by constructing the projector, $H$, that
determines the failure signal, $z$. For time-invariant systems, this
projector is constructed to map the invariant subspace containing
the range of $F_2$ to zero [14,15], i.e.,

$$H = I - C\tilde{F}[(C\tilde{F})^TC\tilde{F}]^{-1}(C\tilde{F})^T, \qquad (24)$$

where

$$\tilde{F} = [A^{\beta_1}f_1, \ldots, A^{\beta_{p_2}}f_{p_2}]. \qquad (25)$$

The vector $f_i$, $i = 1 \cdots p_2$, is the $i$th column of $F_2$, and the integer
$\beta_i$ is the smallest natural number such that $CA^{\beta_i}f_i \neq 0$. With little
additional effort, this result can be extended to the time-varying
case,

$$H = I - C\tilde{F}(t)[(C\tilde{F}(t))^TC\tilde{F}(t)]^{-1}(C\tilde{F}(t))^T. \qquad (26)$$

The columns of the matrix,

$$\tilde{F}(t) = [b_1^{\beta_1}(t), \ldots, b_{p_2}^{\beta_{p_2}}(t)], \qquad (27)$$

are constructed with the Goh Transformation [4]:

$$b_i^0(t) = f_i(t), \qquad (28)$$

$$b_i^j(t) = A(t)\,b_i^{j-1}(t) - \dot{b}_i^{j-1}(t). \qquad (29)$$

In the time-varying case, $\beta_i$ is the smallest integer for which the
iteration above leads to a vector $b_i^{\beta_i}(t)$ such that $C(t)b_i^{\beta_i}(t) \neq 0$
for all $t \in [t_0, t_1]$. It will be assumed that $A(t)$, $C(t)$, and
$F_2(t)$ are such that $\beta_i$ exists. Since the state-space has dimension,
$n$, $\beta_i$ is such that $0 \le \beta_i \le n - 1$.

Remark 1. One of the advantages to the disturbance attenuation
approach to designing FDI Observers is that the time-varying case
can be handled as easily as the time-invariant case. This is an
improvement over classic detection filter designs.
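
The Goh iteration (28), (29) is easy to prototype numerically. The sketch below is a hedged illustration rather than the authors' procedure: the time derivative is approximated by a central difference with step `h`, and the matrices `A(t)`, `C` and the fault direction `f(t)` are hypothetical.

```python
def goh_vectors(A, f, order, h=1e-4):
    """Return functions b_0, ..., b_order of time with b_0(t) = f(t) and
    b_k(t) = A(t) b_{k-1}(t) - d/dt b_{k-1}(t).
    The derivative is approximated by a central difference (an assumption)."""
    def step(prev):
        def b(t):
            Ab = [sum(a * x for a, x in zip(row, prev(t))) for row in A(t)]
            ahead, behind = prev(t + h), prev(t - h)
            return [ab - (p - m) / (2.0 * h) for ab, p, m in zip(Ab, ahead, behind)]
        return b

    vectors = [f]
    for _ in range(order):
        vectors.append(step(vectors[-1]))
    return vectors


def A(t):
    """Hypothetical time-varying state matrix."""
    return [[0.0, t], [0.0, 0.0]]


def f(t):
    """Hypothetical fault direction (constant in time)."""
    return [0.0, 1.0]


C = [1.0, 0.0]                           # single hypothetical output row


def output(b, t):
    """C(t) b(t) for the single output row above."""
    return sum(ci * bi for ci, bi in zip(C, b(t)))


b0, b1, b2 = goh_vectors(A, f, 2)
```

For this example `C b0` vanishes while `C b1` equals `t`, so on an interval such as $[1, 2]$ the smallest admissible index is 1; the extra vector `b2` is computed only to show that the time-derivative term in (29) matters when `A` depends on time.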

We are now ready to discuss the conditions under which the
solution to (23) will also generate an input observable mapping
from $\mu_1$ to $z$. The key requirement is that the system be output
separable. That is, $F_1$ and $F_2$ must be linearly independent and
remain so when mapped to the output space by $C$ and $A$. For
time-invariant systems, the test for output separability is

$$\mathrm{rank}\,[CA^{\delta_1}\bar{f}_1, \ldots, CA^{\delta_{p_1}}\bar{f}_{p_1}, CA^{\beta_1}f_1, \ldots, CA^{\beta_{p_2}}f_{p_2}] = p_1 + p_2. \qquad (30)$$

As in (25), $f_i$ is the $i$th column of $F_2$, and $\beta_i$ is the smallest
integer such that $CA^{\beta_i}f_i \neq 0$. Similarly, $\bar{f}_j$ is the $j$th column of $F_1$
(written here with a bar to distinguish it from the columns of $F_2$),
and $\delta_j$ is the smallest integer such that $CA^{\delta_j}\bar{f}_j \neq 0$. The integer
sum, $p_1 + p_2$, is the total number of columns in $F_1$ and $F_2$.
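
The time-invariant output separability test (30) amounts to a rank computation on the first nonzero output images of the fault directions. The following sketch shows one way to carry it out; the function names, the tolerance values and the example matrices are assumptions made for illustration and do not come from the paper.

```python
def matvec(M, v):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def first_visible(C, A, f, tol=1e-12):
    """Return C A^k f for the smallest k with C A^k f != 0."""
    v = list(f)
    for _ in range(len(A)):
        out = matvec(C, v)
        if any(abs(o) > tol for o in out):
            return out
        v = matvec(A, v)
    raise ValueError("fault direction never appears at the output")


def rank(columns, tol=1e-9):
    """Rank of the matrix with the given columns, by Gaussian elimination."""
    rows = [list(r) for r in zip(*columns)]
    r = 0
    for col in range(len(columns)):
        if r == len(rows):
            break
        pivot = max(range(r, len(rows)), key=lambda i: abs(rows[i][col]))
        if abs(rows[pivot][col]) <= tol:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def output_separable(C, A, F1_cols, F2_cols):
    """Test (30): the output images of all fault columns are linearly independent."""
    cols = [first_visible(C, A, f) for f in list(F1_cols) + list(F2_cols)]
    return rank(cols) == len(cols)


# Hypothetical three-state, two-output example
A = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
C = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
separable = output_separable(C, A, [[1.0, 0.0, 0.0]], [[0.0, 0.0, 1.0]])
not_separable = output_separable(C, A, [[0.0, 1.0, 0.0]], [[0.0, 0.0, 1.0]])
```

In the second call both fault directions reach the output along the same direction, so the rank drops below the number of fault columns and the test fails.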

For time-varying systems, the output separability test becomes

$$\mathrm{rank}\,[C(t)\bar{b}_1^{\delta_1}(t), \ldots, C(t)\bar{b}_{p_1}^{\delta_{p_1}}(t), C(t)b_1^{\beta_1}(t), \ldots, C(t)b_{p_2}^{\beta_{p_2}}(t)] = p_1 + p_2, \quad \forall t \in [t_0, t_1], \qquad (31)$$

where the vectors, $\bar{b}_j^{\delta_j}$ and $b_i^{\beta_i}$, are found from the iteration defined
by (28) and (29). The initial vector, $\bar{b}_j^0$, is set equal to the $j$th
column of $F_1$, and $b_i^0$ is initialized as the $i$th column of $F_2$.

The following proposition, given in [4], connects output separability
to input observability and shows the importance of the
monicity assumption:

Theorem 2. Suppose that a given filter satisfies (23) and generates
the failure signal $z$ given by (20). If $F_1$ and $F_2$ are output separable
and $F_1$ is monic, then the mapping, $\mu_1(t) \mapsto z(t)$, is input
observable.

3.2 A Game Theoretic Solution. We now turn our attention
to the disturbance attenuation problem implied by (23). We
begin by defining a disturbance attenuation function,

$$D_{af} = \frac{\int_{t_0}^{t_1}\|HC(x - \hat{x})\|^2_Q\,dt}{\int_{t_0}^{t_1}\big[\|\mu_2\|^2_{M^{-1}} + \|v\|^2_{V^{-1}}\big]dt + \|x(t_0) - \hat{x}_0\|^2_{P_0^{-1}}}. \qquad (32)$$

$D_{af}$ is simply a ratio of the outputs over the disturbances. Equation
(32) is patterned roughly after (23). We have added the sensor
noise, $v$, and the initial error, $x(t_0) - \hat{x}_0$, to the set of disturbance
signals to inject tradeoffs for noise rejection and settling time into
the problem. $M$, $V$, $Q$, and $P_0$ are weighting matrices. Note that
we do not include the target fault, $\mu_1$, at this stage of the design
problem, since we are now focusing on nuisance blocking. Our
only concern with $\mu_1$ is that it be visible at the output, which is
what Theorem 2 guarantees. The disturbance attenuation problem
is to find the estimate, $\hat{x}$, so that for all $\mu_2, v \in L_2[t_0, t_1]$ and
$x(t_0) \in \mathcal{R}^n$,

$$D_{af} \le \gamma.$$

The positive real number, $\gamma$, is called the disturbance attenuation
bound. $(C, A)$ will always be assumed to be an observable pair.

To solve this problem, we convert (32) into a cost function,

$$J = \int_{t_0}^{t_1}\left[\|HC(x-\hat x)\|^2_Q - \gamma\left(\|\mu_2\|^2_{M^{-1}} + \|y - Cx\|^2_{V^{-1}}\right)\right]dt - \|x(t_0)-\hat x_0\|^2_{\Pi_0}, \qquad (33)$$

where we have used (2) to rewrite the measurement noise term.
Note that we have also rewritten the initial error weighting, defining
$\Pi_0 := \gamma P_0^{-1}$. The disturbance attenuation problem is then solved
via the differential game,

$$\min_{\hat x}\,\max_{y}\,\max_{\mu_2}\,\max_{x(t_0)} J \le 0, \qquad (34)$$

subject to

$$\dot x = Ax + F_2\mu_2, \qquad y = Cx + v. \qquad (35)$$

Those familiar with linear-quadratic optimization will recognize
the solution of the differential game [4] to be a Luenberger observer,

$$\dot{\hat x} = A\hat x + \gamma\Pi^{-1}C^TV^{-1}(y - C\hat x), \qquad \hat x(t_0) = \hat x_0, \qquad (36)$$

whose gain is taken from the solution to a Riccati equation,

$$-\dot\Pi = A^T\Pi + \Pi A + \frac{1}{\gamma}\Pi F_2MF_2^T\Pi + C^T(HQH - \gamma V^{-1})C, \qquad \Pi(t_0) = \Pi_0. \qquad (37)$$

In many cases, it is desired to extend finite-time solutions of game
theoretic problems to the steady-state condition. Whenever it is
possible to find such a solution, the optimal estimator will be
given by (36) with $\Pi$ being the solution of the algebraic Riccati
equation,

$$0 = A^T\Pi + \Pi A + \frac{1}{\gamma}\Pi F_2MF_2^T\Pi + C^T(HQH - \gamma V^{-1})C. \qquad (38)$$

However, unlike linear quadratic optimal control problems, there
are no conditions which guarantee the existence of a unique,
nonnegative definite, stabilizing solution to the steady-state Riccati
equation, except in the special case where A is asymptotically
stable [17].
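As a small numerical illustration of (36) and (38), the sketch below evaluates the right-hand side of the algebraic Riccati equation and the corresponding observer gain. The function names and the scalar toy problem are assumptions made for this example only; in the scalar case the projector is taken as H = 0, so the example exercises only the algebra, not a realistic design.

```python
import numpy as np

def riccati_residual(Pi, A, F2, M, C, H, Q, V, gamma):
    """Right-hand side of (38); zero when Pi solves the algebraic Riccati equation."""
    Vinv = np.linalg.inv(V)
    return (A.T @ Pi + Pi @ A
            + (Pi @ F2 @ M @ F2.T @ Pi) / gamma
            + C.T @ (H @ Q @ H - gamma * Vinv) @ C)

def filter_gain(Pi, C, V, gamma):
    """Observer gain gamma * Pi^{-1} C^T V^{-1} appearing in (36)."""
    return gamma * np.linalg.solve(Pi, C.T @ np.linalg.inv(V))

# Scalar toy problem (hypothetical numbers): A = -1, all other weights = 1, H = 0.
one = np.array([[1.0]])
A, F2, M, C, Q, V = -one, one, one, one, one, one
H = np.zeros((1, 1))
gamma = 1.0

# With these values (38) reduces to Pi^2 - 2*Pi - 1 = 0; take the positive root.
Pi = np.array([[1.0 + np.sqrt(2.0)]])

residual = riccati_residual(Pi, A, F2, M, C, H, Q, V, gamma)
L = filter_gain(Pi, C, V, gamma)
closed_loop = A - L @ C   # error dynamics matrix of the observer (36)
```

For matrix problems the same residual function can be used to check a candidate solution obtained from any Riccati solver, keeping in mind the remark above that a stabilizing solution is not guaranteed to exist.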

4  The  Decentralized  Fault  Detection  Filter 

Given  the  results  of  the  previous  two  sections,  we  now  propose 
a  decentralized  fault  detection  filtering  algorithm.  The  essential 
idea  is  to  implement  the  Riccati-based  game  theoretic  fault  detec¬ 
tion  filter  as  a  decentralized  estimator.  An  overview  of  the  proce¬ 
dure  is  as  follows: 

1 Identify the sensors and actuators which must be monitored
at the global level, i.e., define the target faults for the global filter.

2 Identify the faults that should be included in the global nuisance
set. The remaining faults should be monitored at the local
levels.

3 Derive global and local models for the system including failure
maps. Chung [4] contains a brief discussion about this process.
We will demonstrate one method in which the local models
are derived from the global model via a minimum realization.

4 Design game theoretic fault detection filters for the local and
global systems. Solve the corresponding Riccati equations and
store the solutions for later use.

5 Determine the blending solutions $G^j$ from Eq. (16).

6 Propagate the local estimates, $\hat x^j$, and vectors, $h^j$, and then
use the decentralized estimation algorithm (10) to derive a global
estimate, $\hat x$.

7 Determine the global failure signal from $(y - C\hat x)$, where $y$ is
the total measurement set, $C$ is the global measurement matrix,
and $\hat x$ is the global fault detection filter estimate just derived.

Remark 3. Minimum realizations leave only those states that are
both observable and controllable. Our use of minimum realizations
in step #3 extracts the local models from the global model by
pulling out only those states (or combinations of states) that are
observable through the local measurements, $y^j$, and driven by the
failures chosen to be included in the local model. Determining a
compatible and consistent local/global decomposition is a key issue
in decentralized estimation and control. The use of minimum
realizations that we suggest here is a logical and theoretically
rigorous approach to this problem.

5  Range  Sensor  Fault  Detection  in  a  Platoon  of  Cars 

5.1 Problem Statement. We will now examine the utility
of the decentralized approach to FDI by working through an example.
The problem that we will look at involves the detection of
failures within a system of two cars traveling as a platoon (see
Fig. 1). The cars are controlled to maintain a uniform speed and
constant separation. The platoon is the central component of automated
highway schemes in which groups of cars line up single
file and travel as a unit. The objective is to eliminate the backup
caused by the interaction of individual vehicles maneuvering
across highway lanes [18,19]. The viability of the platooning
scheme, however, will depend on many factors, not the least of
which are reliability and safety.

The FDI schemes that we have examined to this point are capable
of monitoring individual cars, but may not be ideal for
monitoring elements that deal with the interactions between cars.
For example, to maintain uniform speed throughout the platoon
and to keep the spacing between the cars constant, additional sensors
will be needed to measure the relative speed and the relative
distance, or “range,” between the cars. In order to detect a failure
in the range sensor using analytical redundancy, however, it is
necessary to have a dynamical relationship between the range sensor
and other sensors on the vehicles. Range, however, involves
the dynamics of both of the cars and so would require a higher-order
model for its detection filter.

While this is not necessarily prohibitive, it does not make use of
the many different state estimates that are already being propagated
throughout the platoon. The sensors on each of the cars, for
instance, will be monitored by detection filters, and it is more than
likely that a state estimate would also be generated by the vehicles’
control loops. Given these pre-existing estimates, it seems
logical to make use of the decentralized estimation algorithm to
carry out range sensor fault detection.

The presentation of the example is as follows. In the next subsection,
we present the problem and the model of a single car
derived in [18]. We then manipulate this model into a two-car
platoon model and define the target and nuisance faults. Referring
back to the steps listed in Section 4, these are steps #1, #2, and
part of #3. In Section 5.3, we complete step #3 by deriving the
local models from the global one. In Section 5.4, we design game
theoretic filters for the local and global problems and calculate the
blending matrices (steps #4 and #5). We also implement the decentralized
estimator equations (step #6) and monitor the generated
residual for indications of a range sensor failure (step #7).

5.2 System Dynamics and Failure Modeling. Our example
starts with the car model used in [18]. In this model, the
nonlinear, six degree-of-freedom dynamics of a representative automobile
are linearized about a straight, level path at a speed of 25
meters/s (roughly 56 miles per hour). The linearized equations are
found to decouple nicely into lateral and longitudinal dynamics,
much like an airplane. Moreover, the linearized equations can be
further reduced by eliminating “fast modes” and actuator states.
For simplicity, we will only use the longitudinal dynamics, which
we represent as

$$\dot x = A^L x, \qquad y = C^L x,$$

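where the superscript “L” stands for “longitudinal.” The vehicle
states are

$$x = \begin{bmatrix} m_a \\ \omega_e \\ v_x \\ v_z \\ z \\ q \\ \theta \end{bmatrix}
\quad
\begin{array}{l}
\text{engine air mass (kg)} \\
\text{engine speed (rad/s)} \\
\text{long. velocity (m/s)} \\
\text{vertical velocity (m/s)} \\
\text{vertical position (m)} \\
\text{pitch rate (rad/s)} \\
\text{pitch (rad)}
\end{array} \qquad (39)$$

and are propagated by the state matrix,

$$A^L = \begin{bmatrix}
-22.56 & -0.11683 & 0 & 0 & 0 & 0 & 0 \\
307.03 & -35.412 & 397.43 & -238.06 & -2698 & -3753 & -331.14 \\
0 & 0.071298 & -0.81773 & 0.59338 & 6.7786 & 16.807 & 1.5162 \\
0 & -0.0019628 & 0.022119 & -3.5646 & -40.421 & -9.0765 & -0.81415 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & -0.019628 & 0.22118 & -0.61304 & -7.1619 & -39.926 & -3.6293
\end{bmatrix}. \qquad (40)$$

The measurements are

$$y = \begin{bmatrix}
\text{engine air mass (kg)} \\
\text{engine speed (rad/s)} \\
\text{long. acceleration (m/s}^2\text{)} \\
\text{vertical acceleration (m/s}^2\text{)} \\
\text{pitch rate (rad/s)} \\
\text{front symmetric wheel speed (rad/s)} \\
\text{rear symmetric wheel speed (rad/s)}
\end{bmatrix} \qquad (41)$$

with the corresponding measurement matrix,

$$C^L = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.0713 & -0.8177 & 0.5934 & 6.7786 & 16.8068 & 1.5162 \\
0 & -0.0020 & 0.0221 & -3.5646 & -40.4210 & -9.0765 & -0.8141 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 7.1220 & -4.5806 & -51.9152 & 58.8718 & 5.1944 \\
0 & 0.0888 & 5.9738 & -3.5782 & -40.5542 & -56.4109 & -4.9773
\end{bmatrix}. \qquad (42)$$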
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The  rear  and  front  symmetric  wheel  speeds  are  states  that  were 
eliminated  when  the  fast  modes  were  factored  out  of  the  linearized 
system. 

5.3 Global and Local Decomposition. In order to build a
detection filter for the range sensor, we need to use (39)-(42) to
build state space models for the platoon,

$$\dot\eta = A\eta + F_1\mu_1 + F_2\mu_2, \qquad y = C\eta,$$

and the two individual cars,

$$\dot\eta^1 = A^1\eta^1 + F_1^1\mu_1^1 + F_2^1\mu_2^1, \qquad y^1 = E^1\eta^1,$$

$$\dot\eta^2 = A^2\eta^2 + F_1^2\mu_1^2 + F_2^2\mu_2^2, \qquad y^2 = E^2\eta^2.$$

We will build up our models with the following steps:

1 Using (39)-(42), we will derive the global state matrices, A
and C.

2 Using the modelling techniques described in [15] and [4], we
will determine the failure maps, $F_i$.

3 We will then obtain the local state matrices, $A^i$, $E^i$, and $F_j^i$,
from the minimum realization of the triples $(C^1, A, F_2)$ and
$(C^2, A, F_2)$.

The obvious way to get the global matrices, A and C, is to form
block diagonal composite matrices with $A^L$ and $C^L$ repeated on
the diagonal, i.e.,

$$A' = \begin{bmatrix} A^L & 0 \\ 0 & A^L \end{bmatrix}, \qquad C' = \begin{bmatrix} C^L & 0 \\ 0 & C^L \end{bmatrix}.$$

This, however, is not sufficient, since there is no way to describe
the range, $R$, between the two vehicles with the given states (39).
Range is the relative distance between the cars,

$$R = x^1 - x^2,$$

where $x^i$ is the longitudinal displacement of car $i$. Displacement,
however, is not a state of the vehicle (39). We must, therefore, add
a range state to the platoon dynamics, using the equation

$$\dot R = v_x^1 - v_x^2.$$

The end result is that the platoon will be a fifteen-state system,

$$\eta = \begin{bmatrix} m_a^1 \\ \omega_e^1 \\ v_x^1 \\ v_z^1 \\ z^1 \\ q^1 \\ \theta^1 \\ m_a^2 \\ \omega_e^2 \\ v_x^2 \\ v_z^2 \\ z^2 \\ q^2 \\ \theta^2 \\ R \end{bmatrix}
\quad
\begin{array}{l}
\text{engine air mass (kg), Car \#1} \\
\text{engine speed (rad/s), Car \#1} \\
\text{long. velocity (m/s), Car \#1} \\
\text{vertical velocity (m/s), Car \#1} \\
\text{vertical position (m), Car \#1} \\
\text{pitch rate (rad/s), Car \#1} \\
\text{pitch (rad), Car \#1} \\
\text{engine air mass (kg), Car \#2} \\
\text{engine speed (rad/s), Car \#2} \\
\text{long. velocity (m/s), Car \#2} \\
\text{vertical velocity (m/s), Car \#2} \\
\text{vertical position (m), Car \#2} \\
\text{pitch rate (rad/s), Car \#2} \\
\text{pitch (rad), Car \#2} \\
\text{range (m).}
\end{array}$$


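The corresponding state matrix is

$$A = \begin{bmatrix} A^L & 0 & 0 \\ 0 & A^L & 0 \\ E_1 & -E_1 & 0 \end{bmatrix}, \qquad E_1 = [\,0\;\;0\;\;1\;\;0\;\;0\;\;0\;\;0\,]. \qquad (43)$$

The measurement matrix is

$$C = \begin{bmatrix} C^L & 0 & 0 \\ 0 & C^L & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad (44)$$

where $C^1$ and $C^2$ can be inferred from (44). Finally, the local
measurement sets are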


$$y^1 = \begin{bmatrix}
\text{engine air mass (kg), Car \#1} \\
\text{engine speed (rad/s), Car \#1} \\
\text{long. acceleration (m/s}^2\text{), Car \#1} \\
\text{vertical acceleration (m/s}^2\text{), Car \#1} \\
\text{pitch rate (rad/s), Car \#1} \\
\text{front symmetric wheel speed (rad/s), Car \#1} \\
\text{rear symmetric wheel speed (rad/s), Car \#1}
\end{bmatrix}$$

and

$$y^2 = \begin{bmatrix}
\text{engine air mass (kg), Car \#2} \\
\text{engine speed (rad/s), Car \#2} \\
\text{long. acceleration (m/s}^2\text{), Car \#2} \\
\text{vertical acceleration (m/s}^2\text{), Car \#2} \\
\text{pitch rate (rad/s), Car \#2} \\
\text{front symmetric wheel speed (rad/s), Car \#2} \\
\text{rear symmetric wheel speed (rad/s), Car \#2} \\
\text{range (m).}
\end{bmatrix}$$
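The block structure of the platoon model can be sketched as follows. This is an illustration only: the random 7x7 matrices are stand-ins for the numerical $A^L$ and $C^L$ of (40) and (42), and the partition of the global measurement matrix into `C1` and `C2` is an assumption read off the local measurement sets above (car #2 carries the range measurement).

```python
import numpy as np

rng = np.random.default_rng(0)
A_L = rng.standard_normal((7, 7))    # stand-in for the 7x7 state matrix in (40)
C_L = rng.standard_normal((7, 7))    # stand-in for the 7x7 measurement matrix in (42)

E1 = np.zeros((1, 7))
E1[0, 2] = 1.0                       # selects the longitudinal velocity state

Z77 = np.zeros((7, 7))
Z71 = np.zeros((7, 1))

# Fifteen-state platoon: car #1 states, car #2 states, range.
A = np.block([[A_L, Z77, Z71],
              [Z77, A_L, Z71],
              [E1, -E1, np.zeros((1, 1))]])

# Assumed local partitions: car #1 sees its seven sensors, car #2 sees
# its seven sensors plus the range measurement.
C1 = np.hstack([C_L, Z77, Z71])
range_row = np.hstack([np.zeros((1, 14)), np.ones((1, 1))])
C2 = np.vstack([np.hstack([Z77, C_L, Z71]), range_row])
C = np.vstack([C1, C2])

eta = rng.standard_normal(15)
range_rate = (A @ eta)[14]           # last row of A gives dR/dt = v_x^1 - v_x^2
```

The last row of `A` reproduces the range-rate equation, which is why a single extra state suffices to make range observable in the global model.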

Our  ultimate  objective  is  to  design  a  filter  that  will  detect  a 
range  sensor  fault  in  the  presence  of  potential  failures  in  the  other 
sensors.  In  an  actual  health  monitoring  system,  we  would  design 
the  global  filter  to  block  out  all  of  the  nuisance  faults  that  are 
output  separable  from  the  range  sensor  fault  and  then  rely  upon 
the  local  filters  to  monitor  the  remaining  faults.  Given  the  size  of 
our  example,  however,  the  full  analysis  required  to  do  a  detailed 
design  would  clutter  our  presentation.  We  will,  therefore,  limit 
ourselves  to  constructing  only  one  local  filter  per  car  and  will 
choose  simple  nuisance  sets  at  both  the  global  and  local  levels. 

For  this  example,  we  choose  to  monitor  the  front  symmetric 
wheel  speed  sensor  at  the  local  level.  The  nuisance  set  is  then 
chosen  to  be  the  engine  air  mass  sensor  and  the  vertical  acceler¬ 
ometer.  At  the  global  level,  the  range  sensor  has  already  been 
designated  as  the  target  fault.  We,  therefore,  complete  the  problem 
definition  by  choosing  the  engine  speed  sensor  and  longitudinal 
accelerometer  as  the  global  nuisance  set.  There  is  no  particular 
significance  attached  to  any  of  our  choices  for  the  nuisance  and 
target  sets,  aside  from  the  choice  of  the  range  sensor  as  the  global 
target  fault. 

Following standard modeling techniques [15,4], we construct
the two engine speed sensor failure maps, $F_{\omega_e^1}$ and $F_{\omega_e^2}$. To save
space, we do not list these matrices explicitly. The interested
reader can refer to [11]. To complete the problem, we also need to
construct maps for the accelerometer failures, $F_{a_x^1}$ and $F_{a_x^2}$, and
the range sensor, $F_R$. For the local filters, failure maps need to be
constructed for the air mass sensors, $F_{m_a^1}$ and $F_{m_a^2}$, vertical
accelerometers, $F_{a_z^1}$ and $F_{a_z^2}$, and front wheel speed sensors, $F_{\omega_f^1}$ and
$F_{\omega_f^2}$. A quick application of (30) will show that all of our failure
sets are output separable.
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We  are  now  in  position  to  generate  the  local  state  equations.  Hie  local  dynamics  for  car  #1  come  from  the  minimum  realization  of 
$(C^1, A, [\,F_{m_a^1}\;\; F_{a_z^1}\,])$. The corresponding matrices are

$$A^1 = \begin{bmatrix}
-0.087694 & 0.0038094 & -0.12133 & -0.010701 & 3.9941 & 42.617 & 1.2879 \\
0.032194 & 1.6765 & 57.123 & 7.2346 & 26.27 & -665.78 & 496.6 \\
0.00005 & -0.021736 & -22.56 & 0.11478 & -0.0001 & 0.00008 & -0.00005 \\
-0.077512 & 7.7689 & -301.66 & -38.647 & -137.16 & 3612 & -2816.7 \\
-0.096212 & -0.073026 & 2.498 & 0.2312 & 0.89067 & -19.054 & 9.0737 \\
-0.94943 & -0.26102 & -0.20407 & -0.067025 & -0.41229 & -2.4689 & 0.16425 \\
-0.27186 & 0.092418 & 0.12024 & 0.19024 & -0.010912 & -1.302 & -1.434
\end{bmatrix},$$

$$E^1 = \begin{bmatrix}
0 & 0 & 1 & 0 & 0 & 0 & 0 \\
-0.0004 & 0.18605 & 0 & -0.98251 & 0.008136 & -0.000665 & 0.000392 \\
0.0043561 & -0.014182 & 0 & -0.090334 & -0.2118 & 11.266 & -14.31 \\
0.00015951 & -0.00067636 & 0 & -0.0048006 & -4.0642 & -41.318 & -2.4264 \\
-0.00014266 & -0.97872 & 0 & -0.18537 & 0.0016064 & 0.024547 & -0.084511 \\
-0.00030256 & 0.0016942 & 0 & 0.0069288 & 1.4478 & -34.102 & -71.377 \\
0.0009564 & -0.0038718 & 0 & -0.019192 & 2.1041 & -55.207 & 42.987
\end{bmatrix},$$

$$F_{m_a^1} = \begin{bmatrix}
0 & -0.12133 \\
0 & 57.1230 \\
1 & -22.5605 \\
0 & -301.6586 \\
0 & 2.4980 \\
0 & -0.2041 \\
0 & 0.12024
\end{bmatrix}, \qquad
F_{a_z^1} = \begin{bmatrix}
7.9031 & -1.6879 \\
-0.0007 & -0.0213 \\
0 & 0 \\
-0.0048 & -0.0057 \\
-0.1760 & -0.7911 \\
-0.0068 & -7.4136 \\
-0.0003 & -2.1388
\end{bmatrix}.$$

The model for Car #2 is similarly found by obtaining the minimum realization of $(C^2, A, [\,F_{m_a^2}\;\; F_{a_z^2}\,])$. The corresponding matrices are


-0.26387 

-0.27372 

0,97419 

-0.040683 

0.28256 

0,2607 

0.042752 

L0237 

- 12,546 

-12.054 

- 1.4539 

-0,79488 

-28.279 

-27.514 

-2,1059 

-3,0468 

195,07 

193.92 

-2,3745 

38,898 

3.8593 

4.598 

0,3571 

0.5193 

-4.0915 

-4.8456 

-0.37617 

-0.5471 

-2654.8 

-2639,1 

32,315 

-529,37 

0 

0 

0 

0 

0 

0 

0 

0 

-0,002510 

0.0001643 

0.000136 

0.03416 

0,004805 

8.129 

6.711 

-0.06539 

-0,19848 

-152,87 

-126,21 

2,7044 

-0,000005 

-21.332 

-18,419 

0.00006 

0.00003 

22.827 

17.824 

-0.00042 

-304.44 

2080.5 

1717.5 

-57,774 

0 

0 

0 

0 

0 

0 

0 

0 

-12.008 

-11,76 

-0,5409 

- 1.6668 

5,9034 

6.9535 

0.5329 

0,7915 

-0,0112 

0,01132 

-0,7316 

-0,6816 

-43.291 

-39,922 

-8.6999 

1.9601 

40,011 

39.775 

-0,4870 

7,9783 

0.69369 

-0.71973 

-0.01465 

-0.007574 

0.9973 

0 

0 

0.0733  ' 

0,0733 

0 

0 

-0,9973 

0.00522 

5.2402 

4.3261 

-0.0711 

-0,00014 

-31,362 

-25.727 

0.00196 

0 

-0,00195 

-0,00161 

0 

0 

-40,162 

-33.156 

0 

0,0065 

-31.356 

-25.886 

-0.08855 

0 

-0,01755 

-0,01449 

0 
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*  0 

0 

■  0 

0  " 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

,  F2  2= 

0 

0 

0,9973 

0.0002 

*  vl 

0 

0 

0 

0 

-5.0327 

-4.9282 

0 

0 

6.0961 

-6,2254 

„  0.0733 

—  307.8575^ 

0 

0 

With all of these system matrices in place, we can now form the residual projectors, $H$, needed to generate the failure signal. In the
global filter, we define

$$F_2 = [\,F_{\omega_e^1}\;\; F_{a_x^1}\;\; F_{\omega_e^2}\;\; F_{a_x^2}\,].$$

In the local filters, we define

$$F_2^i = [\,F_{m_a^i}\;\; F_{a_z^i}\,], \qquad i = 1, 2.$$

The projectors, $H$ and $H^i$, are then found by applying (24). Again, we do not show either of these matrices to save space.
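For readers who want to experiment, a residual projector of this kind can be sketched as below. Equation (24) is not reproduced in this section, so the form used here, $H = I - CF[(CF)^TCF]^{-1}(CF)^T$, is an assumption: it is the standard orthogonal projector that annihilates the output directions of the nuisance faults, and it presumes that $CF$ has full column rank. The matrices are hypothetical toy values.

```python
import numpy as np

def annihilating_projector(C, F):
    """Orthogonal projector H with H @ (C @ F) = 0 (assumes C @ F has full column rank)."""
    CF = C @ F
    return np.eye(C.shape[0]) - CF @ np.linalg.solve(CF.T @ CF, CF.T)

# Hypothetical three-output example with a single nuisance direction.
C = np.eye(3)
F = np.array([[1.0], [1.0], [0.0]])
H = annihilating_projector(C, F)

blocked = H @ C @ F                               # nuisance direction is removed
passed = H @ np.array([0.0, 0.0, 1.0])            # an orthogonal direction passes through
```

Applying such a projector to $y - C\hat x$ removes the component of the residual lying along the nuisance output directions while leaving orthogonal components intact.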

5.4 Decentralized Fault Detection Filter Design. We will first design filters for the local systems. As with all Riccati-based
filters, the central step in the process is obtaining a solution to the appropriate Riccati equation. For simplicity, we will use the
steady-state version. Typically, one iterates on the design by trying various combinations of weightings until a Riccati solution is found
that leads to a filter that gives the best tradeoff between target fault transmission and nuisance fault attenuation. For this example, it was
found that choosing

$$M^1 = 10 \times I_4, \qquad V^1 = I_7, \qquad Q^1 = I_7, \qquad \gamma_1 = 0.18$$

as the weightings and attenuation bound leads to a filter gain,

[Fig. 2 Singular value plot of local game theoretic filter #1]
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Fig.  4  Platoon  example— failure  signal  res 
lion  filter  #  1 
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0.5139 
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for  car  #1,  The  transmission  properties  of  the  filter  are  depicted  in  Fig,  2.  The  minimum  separation  over  frequency  is  only  about  10  dB, 
but the filter has particularly good separation in the high frequency range. For car #2, the same weightings, adjusted for the different
dimensions of the car #2 dynamics,

$$M^2 = 10 \times I_4, \qquad V^2 = \mathrm{diag}[\,1\;\;1\;\;10\;\;1\;\;1\;\;1\;\;1\;\;1\,], \qquad Q^2 = I_8, \qquad \gamma_2 = 0.18,$$

leads to

•  -0.0001 

-0.0000 

0.0000 

-0,0000 

-0.0000 

-0.0003 

-0.0002 

1,1715  * 

0.0001 

0.0000 

-0,0000 

0.0000 

0.0000 

0.0003 

0,0002 

-1.2154 

-0.0000 

-0.0000 

0,0000 

-0.0000 

-0.0000 

0.0000 

0.0001 

-0,0247 

-0.0000 

0.0001 

0.0000 

—0,0000 

-0,0000 

-0.0001 

-0.0000 

-0.0128 

L2=1Q0QX 

0.0081 

-0,0010 

-0.0000 

0.0009 

0.0000 

0.0020 

0.0012 

-0,0002 

0.0027 

0.0005 

0.0003 

-0.0542 

0,0001 

-0,0106 

-0.0146 

-0.0306 

-0,0034 

0.0003 

0.0005 

0,0182 

-0,0002 

-0,0482 

-0.0299 

-0.0233 

.  0.1650 

-2,2230 

-0.0162 

0,0269 

0.0000 

0.0286 

-0.1752 

0 
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The  transmission  properties  for  this  particular  filter  are  depicted  in 
Fig. 3. The reader should notice the similarities in the level of
performance between this filter and the one designed for car #1.

Finally, for the global system, a fault detection filter for range
sensor health monitoring in the platoon is found by solving the
corresponding Riccati equation with the weightings:

$$V^{-1} = \mathrm{diag}(1, 100, 100, 1, 1, 1, 1, 1, 100, 100, 1, 1, 1, 1, 1), \qquad Q = I_{15},$$

$$M = 100 \times I_8, \qquad \gamma = 0.18.$$

For the global system, however, we are not interested in finding a
gain for a global filter, but in obtaining a global Riccati solution,
$\Pi$, for use in determining the blending matrices,

G'=yrr‘s'!^Wj 

The connecting matrices, $S^j$, are taken to be the pseudo-inverses
of $E^j$. As the dimensions of these matrices are quite large,² we
cannot list them in this paper.

Note  that  we  use  our  design  freedom  in  V.  The  reason  for  this 
is  that  if  we  had  not  used  this  freedom  and  chosen 


the response of the filter to the target fault input would have been
unsatisfactory. In Fig. 4, we show this response, which is constructed
by implementing our decentralized estimator, (10). In this
figure, the time history of the failure signal, z, is shown when the
system is driven by a fault in the range sensor and a fault in the
longitudinal accelerometer. The range sensor is the target fault
and, as the corresponding plot in Fig. 4 shows, this fault is seen
almost immediately in the residual. Better yet, its presence is seen
over a sustained period. Had we not adjusted the weightings in V,
the time constants in our decentralized filter would have been too
small, resulting in a target fault response that dies away too
quickly.

The reader should note that the responses seen in Fig. 4 can be
understood to be the result of the direct feedthrough of the fault
into z, since a range fault goes directly to the global measurement
vector, y. The longitudinal accelerometer is the nuisance fault, and
we see in the corresponding plot that this failure is also fed
through to the residual but at a much smaller magnitude. A reasonably
well-designed redundancy management system should,
thus, be able to detect the range sensor fault no matter the behavior
of the longitudinal accelerometer.

² $\Pi$ is 15 × 15, for instance.

6  Conclusions 

In  this  paper,  we  have  introduced  a  decentralized  fault  detection 
filter  that  provides  an  alternative  way  to  monitor  large-scale  sys¬ 
tems  for  faults.  The  resulting  filter  has  additional  fault  tolerance, 
because it can check the health of its constituent sensors prior to
deriving  the  top  level  estimate,  and  it  is  easily  scalable.  We  have 
also  introduced  a  logical  and  theoretically  rigorous  method  for 
decomposing  large,  global  systems  into  smaller,  local  ones  using 
minimum  realizations.  An  example  based  upon  the  linearization 
of  a  nonlinear  car  model  is  given  to  illustrate  our  results. 
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A  residual-based  scheme  is  presented  for  solving  the  radar  track-to-track  association  problem  using  bearings- 
only  measurements.  To  accomplish  track  association  between  two  stations,  the  residuals  of  a  bank  of  nonlinear 
filters  called  modified  gain  extended  Kalman  filters  are  analyzed.  Once  tracks  have  been  associated  between  two 
stations, tracks from additional stations may be associated with tracks from the first two stations by checking
algebraic parity equations. Traditional track association methods rely on the local stations’ estimated target


only  measurements.  Our  method  bypasses  this  difficulty  because  our  filters  use  raw  data  from  multiple  stations. 
An  example  demonstrates  that  our  methods  yield  results  superior  to  those  of  standard  methods. 


I.  Introduction 

SUPPOSE that several spatially distributed radar installations
are each tracking several targets. Associating a given target to
its track at each of the radar stations is an important issue, which
the radar literature refers to as the track-to-track association problem.
Suppose further that the stations use passive sensors that only
measure bearings to the target, without measuring range. In this paper,
we outline a strategy for solving this association problem by
analyzing measurement residuals.

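Bearings-only observation functions fall into two special classes
of nonlinear functions, called modifiable and approximately modifiable
nonlinearities, which are defined as follows:

Definition 1. A time-varying function $f : \mathbb{R}^n \to \mathbb{R}^m$ is called modifiable
if there exists an operator $A : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^{m \times n}$ such that for
any $x, \hat x \in \mathbb{R}^n$,

$$f(x) - f(\hat x) = A[f(x), \hat x](x - \hat x) \qquad (1)$$

Definition 2. A time-varying function $f : \mathbb{R}^n \to \mathbb{R}^m$ is called approximately
modifiable if there exists a region $\mathcal{D} \subset \mathbb{R}^n$ and operators
$A : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^{m \times n}$ and $\mathcal{E} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^{m \times n}$ such that for any
$x, \hat x \in \mathcal{D}$,

$$f(x) - f(\hat x) = [A(f(x), \hat x) + \mathcal{E}(\hat x, x - \hat x)](x - \hat x) \qquad (2)$$

where $\lim_{\|x - \hat x\| \to 0} \|\mathcal{E}(\hat x, x - \hat x)\| / \|A(f(x), \hat x)\| = 0$.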

Song and Speyer’s modified gain extended Kalman filter
(MGEKF)¹ is a globally convergent, unbiased, nonlinear observer
for systems whose measurement functions are modifiable or approximately
modifiable. In this paper, the observers we design for
bearings-only track association are MGEKFs.

An early attempt at solving the track-to-track association problem
was made by Singer and Kanyuck.² In their paper, they incorrectly
assumed that estimation errors local to each station were uncorrelated.
Bar-Shalom,³ Bar-Shalom and Fortmann,⁴ and Bar-Shalom
and Campo⁵ later corrected this error by accounting for the correlation
between the local estimation errors due to the common process
noise of the target. Later researchers have integrated the problem
of track association directly into the process of separating the measurements
corresponding to actual targets from clutter.⁶,⁷ In all of
these references, it is assumed that both range and bearings were
measured. In some of these references, the possibility of using an
MGEKF to handle the situation of bearings-only measurements is
mentioned, but none discusses the details of such an implementation,
in particular the problems associated with the asymmetry
of single-station estimation errors. Estimates based on bearings-only
measurements from a single station are especially uncertain along
the line between the target and the receiver. This uncertainty is reduced
when measurements from physically separated stations are
used. Our method attempts to take advantage of this phenomenon by
using estimates constructed from several stations’ measurements.
The paper is organized as follows. We show in Sec. II that
bearings-only measurement functions are modifiable. (Prior results
only showed that they were approximately modifiable.¹) We then
demonstrate in Sec. III that incorrect associations between two
radar stations can be interpreted as sensor faults, so that a bank
of modified-gain fault detection filters can be used to determine the
track associations. Section IV contains the main result, an algorithm
for solving the bearings-only track association problem. The
application of this algorithm to an example in Sec. V compares our
approach to a conventional track association method. Section VI
concludes the paper.

In the sequel, inertial Cartesian coordinates describe the motion
of each target in three dimensions via the state vector

$$x^t = [\,X^t\;\; Y^t\;\; Z^t\;\; \dot X^t\;\; \dot Y^t\;\; \dot Z^t\;\; \ddot X^t\;\; \ddot Y^t\;\; \ddot Z^t\,]^T \qquad (3)$$

and the dynamics of each target are assumed to be of the form

$$x^t(k+1) = A(k)x^t(k) + B(k)w^t(k) \qquad (4)$$

Note that we include an acceleration state to model maneuvering
target dynamics.


II. Modifiability of Bearings-Only Measurements

Song and Speyer¹ showed that the azimuth angle $az_s^t \in [-\pi/2, \pi/2)$
and the elevation angle $el_s^t \in [-\pi/2, \pi/2)$ from station
s to target t, as shown in Fig. 1, are modifiable and approximately
modifiable, respectively. The region $\mathcal{D}$ in which the elevation angle
was approximately modifiable excluded an ellipsoidal region near
the sensor, making their algorithms difficult to implement for situations
where the angular sensor gets close to the target, for example,
in the terminal guidance of a missile. We improve this situation
somewhat by introducing the new angle $\Psi_s^t \in [-\pi/2, \pi/2)$ and describing
the position of the target in terms of $\Psi_s^t$ and $\Phi_s^t = az_s^t$.
Note that $\Psi_s^t$ can be calculated from $az_s^t$ and $el_s^t$ via the equation


(5) 


This section is devoted to proving that the measurement function
for $\Psi_s^t$ is modifiable.
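As a sketch of how such a relation between the angles can arise, assume the conventional definitions $az = \tan^{-1}(Y/X)$ and $el = \tan^{-1}(Z/\sqrt{X^2 + Y^2})$ for a relative position $(X, Y, Z)$ with $X > 0$, together with $\Psi = \tan^{-1}(Z/X)$. The elevation definition is an assumption here, since it would normally be read off Fig. 1. Under these assumptions,

$$\tan\Psi = \frac{Z}{X} = \frac{Z}{\sqrt{X^2 + Y^2}} \cdot \frac{\sqrt{X^2 + Y^2}}{X} = \frac{\tan(el)}{\cos(az)},
\qquad\text{so}\qquad
\Psi = \tan^{-1}\!\left[\frac{\tan(el)}{\cos(az)}\right].$$

The second factor follows from $\cos(az) = X/\sqrt{X^2 + Y^2}$, which holds when $X > 0$.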

Let $\hat x^t$ be an estimate of $x^t$ and assume that the position of the
measurement station in inertial space, $x_s = [\,X_s\;\; Y_s\;\; Z_s\,]^T$, is known.
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'  Radar  Station  (s) 


Fig. 1 Angles for target bearings.

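Then $X_s^t$, $Y_s^t$, $Z_s^t$, $\hat X_s^t$, $\hat Y_s^t$, and $\hat Z_s^t$ can be computed by taking the
difference between elements of $x^t$, $\hat x^t$, and $x_s$.

Suppose that station s measures the bearings of target t with the
measurement vector $z_s^t$. Define $h_s(x^t)$ by

$$z_s^t = \begin{bmatrix} \Phi_s^t \\ \Psi_s^t \end{bmatrix} = h_s(x^t) \triangleq \begin{bmatrix} \tan^{-1}(Y_s^t/X_s^t) \\ \tan^{-1}(Z_s^t/X_s^t) \end{bmatrix} \qquad (6)$$

The measurement residual corresponding to $h_s(\hat x^t)$ is then

$$h_s(x^t) - h_s(\hat x^t) = \begin{bmatrix} \tan^{-1}(Y_s^t/X_s^t) - \tan^{-1}(\hat Y_s^t/\hat X_s^t) \\ \tan^{-1}(Z_s^t/X_s^t) - \tan^{-1}(\hat Z_s^t/\hat X_s^t) \end{bmatrix} \triangleq \begin{bmatrix} \tan^{-1}\alpha \\ \tan^{-1}\beta \end{bmatrix} \qquad (7)$$

Applying the trigonometric identity

$$\tan^{-1}a - \tan^{-1}b = \tan^{-1}[(a - b)/(1 + ab)]$$

we obtain

$$\begin{bmatrix} \tan^{-1}\alpha \\ \tan^{-1}\beta \end{bmatrix}
= \begin{bmatrix}
\tan^{-1}\!\left[\dfrac{(Y_s^t/X_s^t) - (\hat Y_s^t/\hat X_s^t)}{1 + (Y_s^t/X_s^t)(\hat Y_s^t/\hat X_s^t)}\right] \\[2ex]
\tan^{-1}\!\left[\dfrac{(Z_s^t/X_s^t) - (\hat Z_s^t/\hat X_s^t)}{1 + (Z_s^t/X_s^t)(\hat Z_s^t/\hat X_s^t)}\right]
\end{bmatrix}
= \begin{bmatrix}
\tan^{-1}\!\left(\dfrac{Y_s^t\hat X_s^t - \hat Y_s^tX_s^t}{X_s^t\hat X_s^t + Y_s^t\hat Y_s^t}\right) \\[2ex]
\tan^{-1}\!\left(\dfrac{Z_s^t\hat X_s^t - \hat Z_s^tX_s^t}{X_s^t\hat X_s^t + Z_s^t\hat Z_s^t}\right)
\end{bmatrix} \qquad (8)$$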

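$$H(z_s^t) = \begin{bmatrix}
\sin(\Phi_s^t) & -\cos(\Phi_s^t) & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\sin(\Psi_s^t) & 0 & -\cos(\Psi_s^t) & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

Let $d_1 \triangleq \sqrt{(X_s^t)^2 + (Y_s^t)^2}$, $D_1 \triangleq d_1/[X_s^t\hat X_s^t + Y_s^t\hat Y_s^t]$,
$d_2 \triangleq \sqrt{(X_s^t)^2 + (Z_s^t)^2}$, and $D_2 \triangleq d_2/[X_s^t\hat X_s^t + Z_s^t\hat Z_s^t]$. Note also that
$\sin(\Phi_s^t) = Y_s^t/d_1$, $\cos(\Phi_s^t) = X_s^t/d_1$, $\sin(\Psi_s^t) = Z_s^t/d_2$, and
$\cos(\Psi_s^t) = X_s^t/d_2$. Therefore, we can express $D_1$ and $D_2$ as functions
of the estimates and measured angles:

$$D_1 = D_1(z_s^t, \hat x^t) = 1/[\cos(\Phi_s^t)\hat X_s^t + \sin(\Phi_s^t)\hat Y_s^t]$$

$$D_2 = D_2(z_s^t, \hat x^t) = 1/[\cos(\Psi_s^t)\hat X_s^t + \sin(\Psi_s^t)\hat Z_s^t]$$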
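The algebra above can be checked numerically. The sketch below is an independent check, not the paper's code: it verifies that each bearing residual equals $(\tan^{-1}\alpha/\alpha)\,D_1$ (respectively $(\tan^{-1}\beta/\beta)\,D_2$) times the corresponding row of $H$ applied to the position part of $\hat x^t - x^t$. The sign convention $\hat x^t - x^t$, the function name, and the sample coordinates are assumptions of this sketch, and the relative X coordinates are taken positive so that the principal values of the arctangent apply.

```python
import math

def bearing_residual_check(rel, rel_hat):
    """Compare the bearing residual with its modifiable (gain times error) form.

    rel and rel_hat are the true and estimated target positions relative to the
    station, (X, Y, Z), with positive X components (assumption of this sketch).
    """
    X, Y, Z = rel
    Xh, Yh, Zh = rel_hat
    phi = math.atan(Y / X)                      # measured angle Phi
    psi = math.atan(Z / X)                      # measured angle Psi
    lhs = (phi - math.atan(Yh / Xh), psi - math.atan(Zh / Xh))

    alpha = (Y * Xh - Yh * X) / (X * Xh + Y * Yh)
    beta = (Z * Xh - Zh * X) / (X * Xh + Z * Zh)
    D1 = 1.0 / (math.cos(phi) * Xh + math.sin(phi) * Yh)
    D2 = 1.0 / (math.cos(psi) * Xh + math.sin(psi) * Zh)

    # Rows of H applied to the position error e = x_hat - x.
    eX, eY, eZ = Xh - X, Yh - Y, Zh - Z
    He1 = math.sin(phi) * eX - math.cos(phi) * eY
    He2 = math.sin(psi) * eX - math.cos(psi) * eZ

    rhs = (math.atan(alpha) / alpha * D1 * He1,
           math.atan(beta) / beta * D2 * He2)
    return lhs, rhs

lhs, rhs = bearing_residual_check((10.0, 3.0, 2.0), (9.0, 4.0, 1.5))
```

The gain multiplying the error depends only on the measured angles and the estimate, never on the unknown true state, which is the property that makes the measurement function modifiable.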

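If we express the trigonometric functions in $H(z_s^t)$, $D_1$, and $D_2$
in terms of $X_s^t$, $Y_s^t$, $Z_s^t$, $\hat{X}_s^t$, $\hat{Y}_s^t$, and $\hat{Z}_s^t$, we can write Eq. (9) as a
function of $z_s^t$ and $\hat{x}^t$:

$$\begin{bmatrix} \alpha(z_s^t, \hat{x}^t) \\ \beta(z_s^t, \hat{x}^t) \end{bmatrix} = \begin{bmatrix} D_1(z_s^t, \hat{x}^t) & 0 \\ 0 & D_2(z_s^t, \hat{x}^t) \end{bmatrix} H(z_s^t)[x^t - \hat{x}^t] \quad (11)$$

Finally, we can rewrite Eq. (11) as

$$\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1/\alpha(z_s^t, \hat{x}^t) & 0 \\ 0 & 1/\beta(z_s^t, \hat{x}^t) \end{bmatrix} \begin{bmatrix} D_1(z_s^t, \hat{x}^t) & 0 \\ 0 & D_2(z_s^t, \hat{x}^t) \end{bmatrix} H(z_s^t)[x^t - \hat{x}^t] \quad (12)$$

and combine it with Eq. (7) to obtain $h_s(x^t)$ in modifiable form

$$h_s(x^t) - h_s(\hat{x}^t) = \begin{bmatrix} \dfrac{D_1(z_s^t, \hat{x}^t)\tan^{-1}\alpha(z_s^t, \hat{x}^t)}{\alpha(z_s^t, \hat{x}^t)} & 0 \\ 0 & \dfrac{D_2(z_s^t, \hat{x}^t)\tan^{-1}\beta(z_s^t, \hat{x}^t)}{\beta(z_s^t, \hat{x}^t)} \end{bmatrix} H(z_s^t)[x^t - \hat{x}^t] \quad (13)$$

where we have made use of the identity

$$H(z_s^t)[x_s - x^t] = 0$$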


Thus, we have replaced the elevation angle $el_s^t$, from which Song and
Speyer1 produced an approximately modifiable function, with a new
angle $\Psi_s^t$. Like the azimuth angle $\Phi_s^t$, angle $\Psi_s^t$ leads to modifiable
measurement functions.
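As a numerical illustration of the azimuth channel, the following Python sketch (NumPy assumed; the function names and the sample coordinates are hypothetical, not from the paper) checks that the tangent of the angle residual in Eq. (8) can be evaluated from the measured angle and the estimate alone, which is the property the modifiable form relies on.

```python
import numpy as np

def alpha_from_truth(rel, rel_hat):
    """tan(Phi - Phi_hat) from true and estimated relative positions, as in Eq. (8)."""
    X, Y = rel[0], rel[1]
    Xh, Yh = rel_hat[0], rel_hat[1]
    return (Y * Xh - Yh * X) / (X * Xh + Y * Yh)

def alpha_from_measurement(phi, rel_hat):
    """Same quantity, using only the measured azimuth phi and the estimate."""
    Xh, Yh = rel_hat[0], rel_hat[1]
    D1 = 1.0 / (np.cos(phi) * Xh + np.sin(phi) * Yh)   # D_1(z, x_hat)
    return D1 * (np.sin(phi) * Xh - np.cos(phi) * Yh)

# Hypothetical positions relative to the station, in metres.
rel = np.array([4000.0, 3000.0, 500.0])       # true target
rel_hat = np.array([4200.0, 2700.0, 650.0])   # estimated target
phi = np.arctan2(rel[1], rel[0])              # noiseless azimuth measurement
residual = phi - np.arctan2(rel_hat[1], rel_hat[0])

a_true = alpha_from_truth(rel, rel_hat)
a_meas = alpha_from_measurement(phi, rel_hat)
```

Both routes give the same value of alpha, and its arctangent reproduces the angle residual, so the true state never has to be known to evaluate the coefficient. The same check applies to the second channel with Z in place of Y.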

III. Converting Incorrect Associations
into Sensor Faults

Suppose that station s can view several targets, indexed by i, and
measures the bearings of each target. Then each of these measurements
$z_s^i$ is generated by $h_s(x^i)$, as in Eq. (6). Now suppose that
another station, using its local observations, generates a state
estimate of one of the targets that station s views. This estimate $\hat{x}^j$
corresponds to $x^j$, the true state of the jth target at station s, but
neither station knows the value of index j. Our goal is to determine
which of the tracks at station s is the jth one, using only $\{z_s^i\}$, the
measurements local to station s, and $\hat{x}^j$, the other station's state
estimate of one of the targets.

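To this end, let us form the following error residual between the
estimate $\hat{x}^j$ and the measurement $z_s^i$, making use of the result from
the preceding section:

$$z_s^i - h_s(\hat{x}^j) = h_s(x^i) - h_s(\hat{x}^j) = G(z_s^i, \hat{x}^j)(x^i - \hat{x}^j) \quad (14)$$

where from Eq. (13)

$$G(z_s^i, \hat{x}^j) = \begin{bmatrix} \dfrac{D_1(z_s^i, \hat{x}^j)\tan^{-1}\alpha(z_s^i, \hat{x}^j)}{\alpha(z_s^i, \hat{x}^j)} & 0 \\ 0 & \dfrac{D_2(z_s^i, \hat{x}^j)\tan^{-1}\beta(z_s^i, \hat{x}^j)}{\beta(z_s^i, \hat{x}^j)} \end{bmatrix} H(z_s^i) \quad (15)$$

By introducing a zero term into the measurement residual, we can
rephrase it as

$$z_s^i - h_s(\hat{x}^j) = h_s(x^i) - h_s(\hat{x}^j) \quad (16)$$

$$z_s^i - h_s(\hat{x}^j) = G(z_s^i, \hat{x}^j)(x^i - \hat{x}^j) \quad (17)$$

$$z_s^i - h_s(\hat{x}^j) = G(z_s^i, \hat{x}^j)(x^i - x^j + x^j - \hat{x}^j) \quad (18)$$

$$z_s^i - h_s(\hat{x}^j) = G(z_s^i, \hat{x}^j)(x^j - \hat{x}^j) + G(z_s^i, \hat{x}^j)(x^i - x^j) \quad (19)$$

$$z_s^i - h_s(\hat{x}^j) = G(z_s^i, \hat{x}^j)(x^j - \hat{x}^j) + \mu^{ij} \quad (20)$$

where $\mu^{ij} = G(z_s^i, \hat{x}^j)(x^i - x^j)$ represents the difference between
$x^i$ and $x^j$ as a sensor fault. If i = j, we have correctly guessed the
association between measurement and estimate, and there is no fault
($\mu^{ij} = 0$). If i ≠ j, then $\mu^{ij} \neq 0$, playing the role of a sensor fault in
the residual.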

IV. Algorithm for Track Association
from Bearings-Only Measurements via Fault
Detection Filters

Suppose that there are S radar stations, with known inertial coordinates,
that make bearings-only measurements in three-space of T
different targets. We assume that all measurements at each station
have been grouped as tracks of each target visible at that station
using conventional means.4,8,9 In this section, we propose an algorithm
for associating the tracks at all stations to their corresponding
targets.

Assume that each measurement station s is located at known
inertial coordinates $(X_s, Y_s, Z_s)$. Let $\hat{x}^{1i}$ denote a fault detection
filter's estimate of the target corresponding to the ith track at the
first station. The bearings-only measurement function for the station
s of the same target is thus

$$h_s(x^{1i}) = \begin{bmatrix} \tan^{-1}[(Y^{1i} - Y_s)/(X^{1i} - X_s)] \\ \tan^{-1}[(Z^{1i} - Z_s)/(X^{1i} - X_s)] \end{bmatrix}$$

and the residual of track j at station s satisfies

$$z_s^j - h_s(\hat{x}^{1i}) \approx G(z_s^j, \hat{x}^{1i})(x^{1i} - \hat{x}^{1i}) + \mu^{ij} + v^j \quad (21)$$

where $G(z_s^j, \hat{x}^{1i})$ is given by Eq. (15) and the sensor noise is
$v^j \sim N(0, V^j)$.

The approximate structure of Eq. (21) is due to the replacement of the
measurement function in $G(\cdot, \cdot)$ with the actual measurement (see
Song and Speyer1). Note that, by default, $\mu^{ii} = 0$, $\forall i = 1, \ldots, T$.

The following algorithm, illustrated in Fig. 2, associates tracks
between stations.

Algorithm (track association):

1) Let i = 1.

2) Run a bank of T detection filters that operate on data from
stations 1 and 2, where the jth filter attempts to detect $\mu^{ij}$. Each
filter is constructed using the dynamic detection filter procedure
given next. All but one of these detection filters should register a
fault. The track corresponding to the filter that detected no fault is
associated with $z_1^i$. Without loss of generality, label this track $z_2^i$.

3) For each track $z_s^l$, s = 3, ..., S, l = 1, ..., T, perform the
algebraic parity test given subsequently. If the result of the parity test
is zero, then $z_s^l$ is associated with $z_1^i$ and $z_2^i$.

4) If i < T, increment i by 1 and go to step 2. If i = T, we have
completed the track association procedure.

Note that estimates obtained in step 2 are used in step 3. Therefore,
stations 1 and 2 should be chosen to maximize observability of the
targets.
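The control flow of steps 1 to 4 can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: it assumes that each detection filter and each parity test has already been reduced to a scalar fault statistic, and that a statistic below a user-chosen `threshold` means "no fault". The function name, the statistics, and the threshold are all hypothetical.

```python
import numpy as np

def associate_tracks(stat12, parity, threshold):
    """Steps 1-4 of the track association algorithm.

    stat12[i, j]    : fault statistic of the detection filter that pairs
                      track i of station 1 with track j of station 2.
    parity[s][i, l] : parity-test statistic of track l at additional
                      station s against the fused estimate for track i.
    Returns {i: [track at station 2, track at station 3, ...]}.
    """
    stat12 = np.asarray(stat12, dtype=float)
    result = {}
    for i in range(stat12.shape[0]):                  # steps 1 and 4: loop over station-1 tracks
        quiet = np.flatnonzero(stat12[i] < threshold)
        if len(quiet) != 1:                           # step 2 expects exactly one filter without a fault
            raise ValueError(f"ambiguous association for track {i}")
        matches = [int(quiet[0])]
        for p in parity:                              # step 3: parity tests at stations 3..S
            ok = np.flatnonzero(np.asarray(p, dtype=float)[i] < threshold)
            matches.append(int(ok[0]) if len(ok) == 1 else None)
        result[i] = matches
    return result

# Two targets, three stations, made-up statistics.
stat12 = [[0.9, 7.5], [8.1, 1.1]]
parity = [[[6.0, 1.2], [0.8, 9.3]]]
assoc = associate_tracks(stat12, parity, threshold=3.0)
```

In this toy case track 0 of station 1 pairs with track 0 of station 2 and track 1 of station 3, while track 1 pairs with tracks 1 and 0 respectively.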

Dynamic Detection Filter

For any estimator of $x^{1i}$, the estimation residual determined by
the measurements $z_1^i$ and $z_2^j$ will not converge to values near zero
unless $z_1^i$ and $z_2^j$ correspond to the same target. One such estimator is
the MGEKF1 given as


From the results of the preceding section, the error residual of track
j at any station s, generated by target i at station 1, is given by Eq. (21).


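$$\bar{x}^{1i}(k+1) = A(k)\hat{x}^{1i}(k) \quad (22)$$

$$r^{ij}(k) = \begin{bmatrix} z_1^i(k) - h_1[\bar{x}^{1i}(k)] \\ z_2^j(k) - h_2[\bar{x}^{1i}(k)] \end{bmatrix} \quad (23)$$

$$\hat{x}^{1i}(k) = \bar{x}^{1i}(k) + K^{ij}(k)r^{ij}(k) \quad (24)$$

$$M^{ij}(k+1) = A(k)P^{ij}(k)A^T(k) + Q(k) \quad (25)$$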

f  “  -  Fi 


{l  +  [(FK-F1)/(x«-X1)]2j 
Z‘j  -  z. 

(x>'-xt)2 

{i  +  [(z«-z1)/(*«-x1)]2] 
fu  -  f2 

(xi'-x,)2 

{i  +  [(f»-f2)/(x«-x2)]2] 
z‘>  -  z2 

(X>'-X2)2 

ji  +  [(zo-z2)/(x»-x2)]2 

0 

1 

|(X--X2)2 

jl  +  [{z>i  -  Z,)/(x»  -  Xj)]2 J(xu  -  X,)2 

0 

1 

jl  +  [(z<J  -  z2)/(xu  -  x2)]2j(x»  -  x2)2 

_ 1 _ 

J 1  +  [(?«  -  F,)/ (x»  -  X,)]2}(x«  -  X$ 

0 

_ 1 _ 

j  1  +  [(?*<  -  f2)/(x«  -  x2)]2) (x><  -  x2f 

0 

000000" 

0  0  0  0  0  0 

0  0  0  0  0  0 

0  0  0  0  0  0 


(26)

Table 1 Radar station positions

Station identification   x position, m   y position, m   z position, m
1                        50              1               0
2                        50,000          1               50
3                        25,000          -400            100

Fig. 2 Track association procedure. Blocks: track i from station 1; track from station 2 that best matches track i, station 1; tracks from station s; bank of algebraic parity tests; track from station s that best matches track i, station 1.

KiJ(k)  +  ?«(*)]' 


G[Aik),^{k),  *"(*)]  = 


G(z\(k),x'i(k)) 


|_G(z^(fc),  (*:))  J 

FHk)  =  {/  -  K“  (k)G[z!x  (*),  zl  (k),  x1'  (*)]  ]^  (k) 


where

$$F = \begin{bmatrix} 0_{3\times 3} & I_{3\times 3} & 0_{3\times 3} \\ 0_{3\times 3} & 0_{3\times 3} & I_{3\times 3} \\ 0_{3\times 3} & 0_{3\times 3} & -aI_{3\times 3} \end{bmatrix}, \qquad \Gamma \triangleq \begin{bmatrix} 0_{3\times 3} \\ 0_{3\times 3} \\ I_{3\times 3} \end{bmatrix} \quad (34)$$

and where w is a zero mean Brownian motion process with covariance
$I_{3\times 3}$ and a = — is the time constant for the first-order filters
that model target maneuvers as colored noise processes. We sample
this model at intervals of T = 0.1 s to generate the discrete time
dynamics

$$x(k+1) = Ax(k) + Bw(k) \quad (35)$$
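The sampling step can be sketched numerically. The Python code below (NumPy only) builds the ninth-order model with position, velocity, and acceleration blocks and evaluates the standard zero-order-hold expressions $A = e^{FT}$ and $B = \int_0^T e^{Ft}\Gamma\,dt$ by a truncated power series. The value `a = 0.1` is a placeholder assumption, since the numerical value of a is not legible in the text; the function name and the number of series terms are also illustrative choices.

```python
import numpy as np

def discretize(a, T, terms=30):
    """Sample xdot = F x + Gamma w at interval T using truncated power series."""
    Z, I = np.zeros((3, 3)), np.eye(3)
    F = np.block([[Z, I, Z], [Z, Z, I], [Z, Z, -a * I]])
    Gamma = np.vstack([Z, Z, I])
    A = np.eye(9)               # accumulates sum_k (F T)^k / k!
    S = np.eye(9) * T           # accumulates sum_k F^k T^(k+1) / (k+1)!
    term_A = np.eye(9)
    term_S = np.eye(9) * T
    for k in range(1, terms):
        term_A = term_A @ F * (T / k)
        term_S = term_S @ F * (T / (k + 1))
        A += term_A
        S += term_S
    return A, S @ Gamma         # B = (integral of e^{Ft} dt) Gamma

a, T = 0.1, 0.1                 # a is an assumed value; T = 0.1 s is from the text
A, B = discretize(a, T)
```

For this block structure the entries have closed forms, for example `A[0, 3] = T`, `A[3, 6] = (1 - exp(-a T)) / a`, and `A[6, 6] = exp(-a T)`, which gives a quick way to validate the series.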


+lCj(k)(ViJrl(k)(gJ)T(k) 


(29)  where 


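where

$$V^{ij}(k) = \mathrm{diag}\{V^i, V^j\} \quad (30)$$

The weighted innovations process of the MGEKF,

$$\nu^{ij}(k) = [h_{\bar{x}^{1i}(k)}M^{ij}(k)h^T_{\bar{x}^{1i}(k)} + V^{ij}(k)]^{-1/2}\,r^{ij}(k) \quad (31)$$

where $r^{ij}(k)$ is the measurement residual of Eq. (23), should be close to a
zero-mean, unit variance white noise sequence only if $z_1^i$ and $z_2^j$
correspond to the same target.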

Algebraic Parity Test

This test determines if $z_s^l$, $S \ge s > 2$, $T \ge l \ge 1$, is associated with
$z_1^i$ and $z_2^i$, where $z_1^i$ and $z_2^i$ are already known to be associated with
each other. Suppose that $\hat{x}^{1i}$ is the state estimate generated by $z_1^i$ and
$z_2^i$. Then, if $z_s^l$ is associated with the tracks $z_1^i$ and $z_2^i$,

$$\nu_s^l(k) = [h_{\hat{x}^{1i}(k)}M^{ij}(k)h^T_{\hat{x}^{1i}(k)} + V^l(k)]^{-1/2}\,\{z_s^l(k) - h_s[\hat{x}^{1i}(k)]\} \quad (32)$$


$$A = e^{FT}, \quad B = \int_0^T e^{Ft}\,\Gamma\,dt, \quad E[w(k)] = 0_{3\times 1}, \quad E[w(k)w^T(l)] = I_{3\times 3}\,\delta_{kl} \quad (36)$$

The targets began the simulation with the initial conditions

$$x_1(0) = [50 \quad 220{,}000 \quad 30{,}000 \quad 250 \quad -1000 \quad 0 \quad 0 \quad 0 \quad 0]^T$$

$$x_2(0) = [50{,}000 \quad 20{,}000 \quad 35{,}000 \quad -250 \quad 1000 \quad 0 \quad 0 \quad 0 \quad 0]^T$$

This configuration corresponds to the two targets initially moving
directly toward each other, in a line that almost passes through station
2. In the simulation, they pass closest to each other at t = 99.2 s.
Each measurement station measures the angles $\Phi_s^i$ and $\Psi_s^i$ to each
target at every sample time. These measurements are subject to
additive, normally distributed zero-mean white measurement noise
with standard deviation 1 deg. We assume that the measurement
noise is independent between sensors at all stations. Each MGEKF
begins with the a priori information

$$\hat{x}^{1i}(0) = [25{,}000 \quad 120{,}000 \quad 32{,}500 \quad 0 \quad 0 \quad 0 \quad 0 \quad 0 \quad 0]^T$$
should be close to a zero mean, unit variance white noise sequence.
Here, the approximate measurement matrix $h_{\hat{x}^{1i}(k)}$ is computed in a
manner similar to the first two rows of the matrix in Eq. (26), but
referenced to $(X_s, Y_s, Z_s)$, the location of station s, instead of the
location of the first station $(X_1, Y_1, Z_1)$. The algebraic parity test is
simply to evaluate the parity equation (32).

V.  Example 

The track association algorithm presented in the last section is
applied to simulation data in this section. Three radar installations
were located at the positions given by Table 1, and two targets
were both modeled as ninth-order linear time-invariant discrete-time
systems with the dynamics


$$P^{ij}(0) = 10^7 \times I_{9\times 9}$$

Finally, we assume that the local stations were able to separate their
measurements from clutter perfectly using methods like those of
Reid9 or Bar-Shalom and Fortmann,4 or Fortmann and Bar-Shalom.8

Figure 3 plots the weighted innovations of a MGEKF that uses
measurements from stations 1 and 2 that correspond to the second
target, whereas Fig. 4 plots the weighted innovations of a MGEKF
that uses measurements that are mismatched. Note that the innovations
for the correct match appear to be a zero mean white noise
sequence, whereas the innovations for the incorrect match are larger
and are not white. To better observe the behavior of these sequences,
their means were estimated using a Kalman filter (assuming that each
element of the weighted innovation of the MGEKF was a measurement
of a process that had integrator dynamics, process noise with


$$\dot{x}(t) = Fx(t) + \Gamma w(t)$$

Fig. 9 Error statistic suggested by Bar-Shalom7 and Bar-Shalom and
Fortmann4: $(\hat{x}_1 - \hat{x}_2)^T E[(\hat{x}_1 - \hat{x}_2)(\hat{x}_1 - \hat{x}_2)^T]^{-1}(\hat{x}_1 - \hat{x}_2)$.

Fig. 10 Euclidean norm of error in tracking target 1.

were several instances where nearly singular matrices were inverted
in the algorithm that computes the covariance of the difference
between two local estimates.

Part of the reason for this difficulty is explained in Fig. 10, a
plot of the Euclidean norm of the estimation error. The solid line
corresponds to a MGEKF that uses measurements from both station
1 and 2, whereas the dotted line is from a filter that only used station
1 measurements. Any method that relies on estimates that only use
a single station's measurements is subject to a large error. This is
not a huge concern for linear estimators, but the matrix $P^{ij}$ defined
by Eq. (29) may not necessarily reflect this error.


We have also encountered cases where a single station measurement
MGEKF was divergent in the radial direction to the target, but
no such difficulties have appeared when data from two geographically
disparate stations were used. One way of generating such a
divergent case was to decrease the maneuver colored noise
autocorrelation parameter a to ^ or below. We note that values of this
parameter below — correspond to slower maneuvers, a commonly
encountered situation.


VI. Conclusions

This  paper  describes  residual-based  techniques  for  solving  the 
radar  track  association  problem  for  bearings-only  measurements. 
The  association  between  the  tracks  at  two  stations  can  be  deter¬ 
mined  by  examining  the  residuals  of  a  bank  of  MGEKFs.  Once  this 
association  is  established,  an  algebraic  parity  test  can  find  the  cor¬ 
respondence  between  tracks  at  other  stations  and  targets  tracked  by 
the  first  two  stations. 

One  may  ask  why  detection  filters  are  necessary:  Why  not  do 
everything  with  algebraic  parity  tests?  Although  the  detection  fil¬ 
tering  step  is  not  strictly  necessary,  it  does  improve  the  quality  of 
the  track  associations  because  the  state  estimates  constructed  from 
two  widely  separated  stations  are  so  much  more  accurate  than  the 
estimates  from  a  single  station. 

To  ensure  the  quality  of  the  estimates  from  the  MGEKFs,  one 
could  delay  the  algebraic  parity  testing  steps  for  associating  tracks 
from  additional  stations.  If  these  parity  tests  are  replaced  with  ad¬ 
ditional  detection  filter  banks  until  the  estimates  before  and  after 
including  a  new  station’s  measurements  are  sufficiently  close,  then 
the fidelity of the estimates can be guaranteed.

ABSTRACT

Although  the  exact  GPS  solution  proposed  by  Bancroft 
is  nonlinear,  it  may  be  manipulated  into  a  linear  form 
when 5 or more satellites are visible. This linear form
is exact, as opposed to the linear solution obtained via
repeated linearization in the iterated least squares (ILS)
method.  By  virtue  of  this  exactness,  the  solution  of 
the  linear  form  is  always  the  true  user  position,  while 
the  ILS  may  converge  to  an  incorrect  solution  (this  is 
especially  common  when  the  GPS  user  is  in  space). 
When  the  measured  pseudoranges  are  noisy,  the  linear 
structure  ensures  that  the  position  estimate  will  converge 
to  the  correct  value  and  that  the  error  covariance  of  the 
estimate  is  known,  guarantees  that  have  not  been  found  for 
nonlinear  estimators  that  use  the  Bancroft  solution  directly. 
The  conversion  to  the  linear  form  excludes  information 
present  in  a  single  scalar  nonlinear  measurement  equation. 
We  demonstrate  several  procedures  for  refining  the  linear 
estimate  with  this  remaining  information.  In  addition, 
we  show  that  the  methodology  developed  for  direct  GPS 
solutions  can  be  applied  to  create  linear  direct  methods  for 
differential  GPS  problems. 


1  INTRODUCTION 

The  purpose  of  the  NAVSTAR  global  positioning  system 
(GPS)  is  to  allow  a  user  to  accurately  determine  their 
three-dimensional  position.  The  system  consists  of  a 
constellation  of  satellites,  each  of  which  broadcasts  a 
predetermined  time-varying  code,  known  in  advance  by  the 
user.  The  user  can  thus  calculate  the  delay  between  the 
broadcast  time  from  each  satellite  and  the  time  of  reception, 
which  translates  into  a  pseudorange  from  the  receiver  to 
that  satellite.  Since  the  positions  of  the  GPS  satellites 
are  accurately  known,  these  pseudoranges  can  be  used  to 
triangulate  the  user  position.  Because  the  user’s  clock  does 
not  align  precisely  with  the  clocks  on  the  GPS  satellites,  the 
measured  pseudoranges  are  not  true  ranges,  and  therefore 
the  error  between  the  user  clock  and  GPS  time  must  be 
estimated  in  order  to  accurately  determine  the  user  position. 

The  pseudoranges  are  nonlinear  functions  of  four 
unknowns:  the  user  position  in  three-space  and  the  user 
clock  error.  Determining  the  unknowns  therefore  requires 
the  pseudoranges  from  at  least  four  non-coplanar  GPS 
satellites.  In  view  of  this  restriction,  the  orbits  of  the 
satellites in the GPS constellation were designed so that
most  earthbound  users  view  at  least  four  GPS  satellites  at 
any  given  time. 

The  most  commonly  used  method  for  determining 
the  unknowns  is  an  iterated  least  squares  (ILS)  method, 
e.g., the one in Ref. [1]. This method amounts to a
gradient  search  procedure  to  minimize  the  error  between 
the  measured  pseudoranges  and  pseudoranges  constructed 
from  the  estimated  unknowns. 

If  the  pseudoranges  are  noiseless,  one  may  solve  the 
pseudorange  equations  directly  for  the  unknowns,  as  first 
reported by Bancroft [2]. This direct method was later
refined  and  adapted  to  the  case  of  noisy  measurements 
[3,  4,  5,  6].  The  estimates  obtained  by  each  of  these 
methods  all  depend  on  the  solution  to  a  quadratic  equation. 

In  this  paper,  we  manipulate  the  measurement 
equations  into  a  linear  form,  which  is  not  dependent  on 


approximate  knowledge  of  the  user  position.  It  is  thus  a 
direct  method,  but  one  that  does  not  require  the  solution 
of  a  set  of  nonlinear  measurement  equations.  Although 
Leva  [7]  has  previously  shown  that  such  a  linear  structure 
existed,  he  used  it  solely  for  the  purpose  of  determining 
when  unique  solutions  to  the  GPS  equations  existed.  In 
this  work,  we  exploit  the  linear  structure  to  obtain  better 
estimates  for  systems  with  measurement  noise. 

The notation is as follows. The column vector $S^{(i)}$
denotes the position of the ith satellite in an inertial
reference frame, and the user position in the same reference
frame is a column vector denoted by $x$. The error between
the user's clock and GPS time is denoted by $\Delta t$. This
clock error causes a bias of $c\Delta t$ in all of the pseudoranges,
where $c$ denotes the speed of light. $X = [x^T \ c\Delta t]^T$
denotes the vector of unknowns for the GPS problem. The
operator $\langle \cdot, \cdot \rangle$ denotes the Euclidean inner product, and
the operator $\|\cdot\|$ indicates the Euclidean norm. $E[\cdot]$ is the
expectation operator. The number of GPS satellites visible
to the user is $m$.

A  review  of  the  ILS  algorithm  appears  in  the  next 
section,  followed  by  a  review  of  the  direct  solution  in 
Section  3.  Section  4  describes  our  linear  direct  method. 
An  extension  of  this  method  is  made  to  the  differential 
GPS  problem  in  Section  5.  Because  the  linear  direct 
method  does  not  use  all  of  the  information  present  in 
the  measurements,  we  propose  several  methods  in  Section 
6  to  improve  upon  the  linear  direct  method's  estimate 
by  using  the  information  present  in  the  scalar  nonlinear 
measurement  equation  that  was  excised  in  the  construction 
of  the  linear  direct  method.  Section  7  compares  the 
accuracy  of  several  direct  methods  operating  on  a  simulated 
data  set,  and  Section  8  concludes  the  paper. 

2  REVIEW  OF  THE  ILS  ALGORITHM 


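For the ith visible GPS satellite, the measured range from
the user to the satellite is given by

$$\rho^{(i)} = \|S^{(i)} - x\| + c\Delta t + \eta^{(i)}. \quad (1)$$

Let us denote $f(S^{(i)}, X) \triangleq \|S^{(i)} - x\| + c\Delta t$. When
equation (1) is linearized about nominal values of
$c\Delta t_*$ and $x_*$, it becomes

$$\delta\rho^{(i)} = F(S^{(i)}, X_*)\,\delta X + \eta^{(i)}, \quad (2)$$

where

$$\delta\rho^{(i)} = \rho^{(i)} - \rho_*^{(i)} = \rho^{(i)} - f(S^{(i)}, X_*), \quad (3)$$

$$\delta X = X - X_*, \quad (4)$$

and

$$F(S^{(i)}, X_*) \triangleq \left.\frac{\partial f(S^{(i)}, X)}{\partial X}\right|_{X = X_*} = \begin{bmatrix} -\dfrac{(S^{(i)} - x_*)^T}{\|S^{(i)} - x_*\|} & 1 \end{bmatrix}. \quad (5)$$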

The ILS algorithm proceeds as follows:

Algorithm 2.1 (ILS).

1. Set j = 0. Let $X_0$ be the initial estimate of $X$.

2. Set a convergence tolerance $\epsilon$.

3. Let $W$ be the covariance of the noise vector $\eta = [\eta^{(1)} \ \eta^{(2)} \ \ldots \ \eta^{(m)}]^T$.

4. Compute $F(S^{(i)}, X_j)$, i = 1, 2, ..., m.

5. Use the measurements $\rho^{(1)}, \rho^{(2)}, \ldots, \rho^{(m)}$ and $X_j$ to
compute $\delta\rho^{(1)}, \delta\rho^{(2)}, \ldots, \delta\rho^{(m)}$ via equation (3).

6. Find the weighted least squares estimate of $\delta X$ using
the formula

$$\delta X = (F^T W^{-1} F)^{-1} F^T W^{-1} \delta\rho, \quad (6)$$

where

$$\delta\rho = \begin{bmatrix} \delta\rho^{(1)} \\ \delta\rho^{(2)} \\ \vdots \\ \delta\rho^{(m)} \end{bmatrix}, \qquad F = \begin{bmatrix} F(S^{(1)}, X_j) \\ F(S^{(2)}, X_j) \\ \vdots \\ F(S^{(m)}, X_j) \end{bmatrix}.$$

7. Update the estimate using the relation $X_{j+1} = X_j + \delta X$.

8. If $\|\delta X\| < \epsilon$, the estimates have converged to within
the tolerance $\epsilon$ and the algorithm can stop. If not, let
j = j + 1 and return to step 4.

Note that the initial guess $X_0$ and the convergence
tolerance $\epsilon$ determine the number of iterations that the
algorithm requires. If $X_0$ is a bad guess, the algorithm may
converge very slowly.
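Algorithm 2.1 can be sketched in a few lines of Python (NumPy assumed). This is an illustrative sketch rather than a reference implementation: the function name, the `max_iter` safeguard, and the satellite geometry are assumptions introduced here, and the measurements in the example are noiseless.

```python
import numpy as np

def ils(S, rho, X0, W=None, eps=1e-6, max_iter=50):
    """Iterated least squares for X = [x, c*dt], following Algorithm 2.1."""
    S = np.asarray(S, dtype=float)
    rho = np.asarray(rho, dtype=float)
    X = np.array(X0, dtype=float)
    Winv = np.eye(len(rho)) if W is None else np.linalg.inv(W)
    for _ in range(max_iter):                        # max_iter is an added safeguard
        diff = S - X[:3]
        dist = np.linalg.norm(diff, axis=1)
        F = np.hstack([-diff / dist[:, None], np.ones((len(rho), 1))])  # step 4, Eq. (5)
        drho = rho - (dist + X[3])                                      # step 5, Eq. (3)
        dX = np.linalg.solve(F.T @ Winv @ F, F.T @ Winv @ drho)         # step 6, Eq. (6)
        X = X + dX                                                      # step 7
        if np.linalg.norm(dX) < eps:                                    # step 8
            break
    return X

# Hypothetical noiseless geometry in metres: six satellites, user near the Earth's surface.
S = 1e6 * np.array([[0, 0, 26.5], [25, 0, 8.8], [-24, 6, 9.5],
                    [3, 25, 8], [-5, -24, 10], [12, 10, 21]])
X_true = np.array([1.2e6, -0.7e6, 6.3e6, 1.0e5])
rho = np.linalg.norm(S - X_true[:3], axis=1) + X_true[3]
X_hat = ils(S, rho, X0=np.zeros(4))
```

Starting from the Earth's center with zero clock bias, the iteration settles on the true position and clock bias for this geometry; a poorer initial guess or geometry can require more iterations, as noted above.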

3  REVIEW  OF  DIRECT  METHODS  FOR  NOISY 
MEASUREMENTS 

This section presents a brief review of the Improved Direct
Solution (IDS) developed by Biton et al. [6], which is
an adjustment of Bancroft's direct method [2] to handle
noisy data. Begin with the equation for the measured
pseudorange from the user to the ith satellite

$$\rho^{(i)} = \|S^{(i)} - x\| + c\Delta t + \eta^{(i)} \quad (7)$$

or equivalently

$$\|S^{(i)} - x\| = \rho^{(i)} - c\Delta t - \eta^{(i)}. \quad (8)$$

Squaring this equation and rearranging terms yields

$$-2\langle S^{(i)}, x\rangle + \chi + 2\rho^{(i)}c\Delta t - (\eta^{(i)})^2 + 2\rho^{(i)}\eta^{(i)} - 2\eta^{(i)}c\Delta t = (\rho^{(i)})^2 - \|S^{(i)}\|^2, \quad (9)$$


where 


4  THE  LINEAR  DIRECT  EQUATIONS 


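$$\chi \triangleq \|x\|^2 - (c\Delta t)^2. \quad (10)$$

Typical values of the quantities in equation (9) are

$\|S^{(i)}\| \approx 10^7$ m, $\|x\| \approx 10^5$ m, $c\Delta t \approx 10^5$ m,

$\eta^{(i)} \approx 1$ m, $\rho^{(i)} \approx 10^7$ m.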

If  we  ignore  all  terms  in  (9)  that  are  smaller  than  10s 
meters,  we  have 


$$-2\langle S^{(i)}, x\rangle + \chi + 2\rho^{(i)}c\Delta t + 2\rho^{(i)}\eta^{(i)} = (\rho^{(i)})^2 - \|S^{(i)}\|^2. \quad (11)$$

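We can assemble a vector equation by applying (11) to the
measurements from every satellite:

$$HX + G\eta = R_a + \chi R_b, \quad (12)$$

where

$$H = 2\begin{bmatrix} -(S^{(1)})^T & \rho^{(1)} \\ -(S^{(2)})^T & \rho^{(2)} \\ \vdots & \vdots \\ -(S^{(m)})^T & \rho^{(m)} \end{bmatrix}, \quad (13)$$

$$R_a = \begin{bmatrix} (\rho^{(1)})^2 - \|S^{(1)}\|^2 \\ (\rho^{(2)})^2 - \|S^{(2)}\|^2 \\ \vdots \\ (\rho^{(m)})^2 - \|S^{(m)}\|^2 \end{bmatrix}, \qquad R_b = \begin{bmatrix} -1 \\ -1 \\ \vdots \\ -1 \end{bmatrix}, \quad (14)$$

$$G \triangleq \mathrm{diag}\{2\rho^{(1)}, 2\rho^{(2)}, \ldots, 2\rho^{(m)}\}, \quad (15)$$

$$\eta = [\eta^{(1)} \ \eta^{(2)} \ \ldots \ \eta^{(m)}]^T. \quad (16)$$

Since G is invertible, equation (12) can be rewritten as

$$G^{-1}HX + \eta = G^{-1}R_a + \chi G^{-1}R_b. \quad (17)$$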


Suppose that $\eta$ were a zero mean Gaussian vector with
covariance $V$. If $\chi$ was a known quantity, and if the a
priori estimate of $X$ was the zero vector, and if the error
covariance of this initial estimate was infinite, then

$$\hat{X} = [(G^{-1}H)^T V^{-1} (G^{-1}H)]^{-1} (G^{-1}H)^T V^{-1} G^{-1}(R_a + \chi R_b) \quad (18)$$

is the least squares estimate of $X$. Substitution of the
elements of $\hat{X}$ in (18) into the definition of $\chi$ (10) results in
a quadratic equation in the unknown $\chi$ (the coefficients of
this quadratic equation are independent of $X$ and $\chi$, hence
the nomenclature "direct solution"). Substituting the two
solutions of this equation into (18) gives two candidates
for $\hat{X}$. Only one of these candidates will satisfy the
measurement equation (12).

Note that for the noiseless case, the estimation
equation (18) simplifies to

$$\hat{X} = [H^T H]^{-1} H^T (R_a + \chi R_b), \quad (19)$$

which is the estimation equation used in the original
Bancroft method.
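The noiseless direct solution of Eq. (19) can be sketched as follows in Python (NumPy assumed). The sketch writes the estimate as $X = a + \chi b$ with $a = H^{+}R_a$ and $b = H^{+}R_b$, substitutes it into the definition (10) to get a quadratic in $\chi$, and keeps the root whose candidate reproduces the pseudoranges. The function name, the residual-based root selection, and the satellite geometry are assumptions made for illustration.

```python
import numpy as np

def direct_solution(S, rho):
    """Noiseless Bancroft-style direct solution built on Eqs. (10), (12) and (19)."""
    S = np.asarray(S, dtype=float)
    rho = np.asarray(rho, dtype=float)
    H = 2.0 * np.hstack([-S, rho[:, None]])           # Eq. (13)
    Ra = rho**2 - np.sum(S**2, axis=1)                # Eq. (14)
    Rb = -np.ones(len(rho))
    Hp = np.linalg.pinv(H)                            # equals (H^T H)^{-1} H^T for full column rank
    a, b = Hp @ Ra, Hp @ Rb                           # X = a + chi * b
    quad = lambda u, v: u[:3] @ v[:3] - u[3] * v[3]   # bilinear form behind chi = |x|^2 - (c dt)^2
    coeffs = [quad(b, b), 2.0 * quad(a, b) - 1.0, quad(a, a)]
    best = None
    for chi in np.roots(coeffs):
        if abs(chi.imag) > 1e-9 * max(1.0, abs(chi)):
            continue                                  # skip complex roots
        X = a + chi.real * b
        res = np.linalg.norm(np.linalg.norm(S - X[:3], axis=1) + X[3] - rho)
        if best is None or res < best[0]:
            best = (res, X)                           # keep the candidate that fits the pseudoranges
    return best[1]

# Hypothetical noiseless geometry in metres.
S = 1e6 * np.array([[0, 0, 26.5], [25, 0, 8.8], [-24, 6, 9.5],
                    [3, 25, 8], [-5, -24, 10], [12, 10, 21]])
X_true = np.array([1.2e6, -0.7e6, 6.3e6, 1.0e5])
rho = np.linalg.norm(S - X_true[:3], axis=1) + X_true[3]
X_direct = direct_solution(S, rho)
```

No initial guess is needed, in contrast with the ILS iteration; the price is the root selection step, which the linear form of the next section avoids.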


The only nonlinear term in the vector measurement
equation (17) is $\chi$. We demonstrate the means to remove
this nonlinearity below.

The matrix $G^{-1}R_b$ is rank 1. Hence, a rank (m − 1)
matrix $E$ exists such that $EG^{-1}R_b = 0$. In fact, there are
an infinite number of such annihilator matrices, with

$$E = \begin{bmatrix} 1 & -1 & 0 & \cdots & 0 \\ 1 & 0 & -1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 1 & 0 & 0 & \cdots & -1 \end{bmatrix}$$

being an obvious example. Multiplying the GPS
measurement equation (17) on the left by the annihilator $E$
creates (m − 1) completely linear exact GPS measurement
equations:

$$EG^{-1}HX + E\eta = EG^{-1}R_a. \quad (20)$$

Remark 4.1. If the noise vector $\eta$ is assumed to be a zero
mean Gaussian with covariance $V \triangleq E[\eta\eta^T]$, the single
epoch least squares solution of the linear measurement
equation (20) that minimizes the cost function


J  =  [EG“xRa  -  EG~1HX]T(EVET)-1  * 

[EG-IRtt  -  EG_1HX]  +  XtMq  1 lX  (21) 


is  given  by 


Xo  =  [HT(G"1)rET(EVET)-:lEG-1H+Mo 1]-1- 
Ht(G~1)tEt(EVEt)-1  • 

(EG_1Ra  -  EG-1HXo),  (22) 

where $\bar{X}_0$ is the a priori state estimate and $M_0$ is the
(potentially infinite) a priori error covariance. The error
covariance $P_0 = E[(X - \hat{X}_0)(X - \hat{X}_0)^T]$ of the estimate
is

$$P_0 = (H^T(G^{-1})^T E^T (EVE^T)^{-1} EG^{-1}H + M_0^{-1})^{-1}. \quad (23)$$

Note that this error covariance will typically be larger
than that for an ILS estimate, since the projected
measurements have effectively double the noise of the
original measurements. Hence, using the estimate obtained
in (22) as the initial guess $X_0$ in the ILS algorithm 2.1
would be a good strategy. Since $\hat{X}_0$ is very close to the
true position, the ILS algorithm should converge in a step
or two. Section 6 derives a similar method that gives even
better results.

If a state model is available, a Kalman filter can
be constructed for estimating the vector $X$. Assuming
that the measurement noise sequence $\eta$ is an independent
and identically distributed (i.i.d.) zero mean Gaussian
random sequence with covariance $V$, equations (22) and
(23) describe the update equations for such a Kalman filter.
Note that an extended Kalman filter is not required, as all
of the measurement equations are linear.
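A minimal Python sketch of the single-epoch linear direct estimate follows (NumPy assumed). It treats the case of an infinite a priori covariance, so that the $M_0^{-1}$ term vanishes and the estimate is $P_0$ times the weighted, projected measurement. The annihilator is taken from the singular value decomposition of $G^{-1}R_b$, which is one convenient choice among the infinitely many mentioned above; the function name and the geometry are assumptions for illustration, and at least five satellites are required.

```python
import numpy as np

def linear_direct_estimate(S, rho, V):
    """Least squares solution of E G^-1 H X + E eta = E G^-1 Ra, Eq. (20), with no prior."""
    S = np.asarray(S, dtype=float)
    rho = np.asarray(rho, dtype=float)
    m = len(rho)
    H = 2.0 * np.hstack([-S, rho[:, None]])          # Eq. (13)
    Ra = rho**2 - np.sum(S**2, axis=1)               # Eq. (14)
    Rb = -np.ones(m)
    Ginv = np.diag(1.0 / (2.0 * rho))                # inverse of Eq. (15)
    g = Ginv @ Rb
    _, _, Vt = np.linalg.svd(g[None, :])             # rows of Vt: g direction, then its complement
    E = Vt[1:]                                       # (m-1) x m annihilator, E @ g = 0
    A = E @ Ginv @ H
    y = E @ Ginv @ Ra
    Wt = np.linalg.inv(E @ V @ E.T)                  # weight (E V E^T)^-1
    P0 = np.linalg.inv(A.T @ Wt @ A)                 # Eq. (23) with M0^-1 = 0
    X0 = P0 @ A.T @ Wt @ y
    return X0, P0

# Hypothetical noiseless geometry in metres.
S = 1e6 * np.array([[0, 0, 26.5], [25, 0, 8.8], [-24, 6, 9.5],
                    [3, 25, 8], [-5, -24, 10], [12, 10, 21]])
X_true = np.array([1.2e6, -0.7e6, 6.3e6, 1.0e5])
rho = np.linalg.norm(S - X_true[:3], axis=1) + X_true[3]
X_lin, P_lin = linear_direct_estimate(S, rho, V=np.eye(len(rho)))
```

With noiseless pseudoranges the linear equations are satisfied exactly by the true state, so the estimate matches it without any iteration or root selection. With noisy data, `P_lin` gives the error covariance under the stated Gaussian assumption.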

5 EXACT LINEAR SOLUTION OF NOISY DIFFERENTIAL
GPS EQUATIONS


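Let us expand the terms on the right hand side containing
the clock biases, giving us

$$-2\langle S^{(i)}, \Delta x\rangle + \|x_2\|^2 - \|x_1\|^2 = 2(\rho_1^{(i)} - \eta_1^{(i)})(\delta\rho^{(i)} - c\Delta t_{12} - \eta_{12}^{(i)}) - 2c\Delta t_1\delta\rho^{(i)} + 2c\Delta t_1 c\Delta t_{12} + 2c\Delta t_1\eta_{12}^{(i)} + (\delta\rho^{(i)} - \eta_{12}^{(i)})^2 - 2c\Delta t_{12}(\delta\rho^{(i)} - \eta_{12}^{(i)}) + (c\Delta t_{12})^2, \quad (28)$$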


Suppose that we are interested in knowing the distance
$\Delta x = x_2 - x_1$ between two receivers located at the
positions $x_1$ and $x_2$. Let $c\Delta t_1$ be the clock bias of the first
receiver, and let $c\Delta t_2$ be the clock bias of the second.
Define $c\Delta t_{12}$ as the difference between the two clock
biases, $c\Delta t_{12} = c\Delta t_2 - c\Delta t_1$.

For each satellite i that is visible to both receivers,
there are two measurements available: $\rho_1^{(i)} = \|S^{(i)} - x_1\| + c\Delta t_1 + \eta_1^{(i)}$
and $\delta\rho^{(i)} \triangleq \|S^{(i)} - x_2\| - \|S^{(i)} - x_1\| + c\Delta t_{12} + \eta_{12}^{(i)}$.
Note that $\eta_1^{(i)}$ and $\eta_{12}^{(i)}$ are correlated,
but the magnitude of $\eta_{12}^{(i)}$ is less than that of $\eta_1^{(i)}$ due to the
elimination of common mode errors from the differential
measurement. If $\delta\rho^{(i)}$ is a differential carrier phase
measurement, then $\eta_{12}^{(i)}$ should be very small compared to
$\eta_1^{(i)}$, since carrier phase multipath is much smaller than
code multipath.

Proposition  5.1.  The  distance  between  the  receivers  Ax 
satisfies  the  following  equation  for  each  visible  satellite  i: 

-  2  <  S«  Ax  >  +||x2||2  -  Hxili2  +  2pfcAtl2  + 
28p^cAh  +  25p{i)cAt12  -  2cAtxcAt12  -  (cAt12)2  + 

2 P(i)rii2  ~2cAt12r,f  -2r7(<)i7(i2)-2cAti4)+25pWj?<’) - 
(jjg)2  -  2cAf124)  =  2 pfSpW  +  (SpWf.  (24) 
Proof. Begin by noting the identity

$$\|S^{(i)} - x_2\|^2 = \left(\|S^{(i)} - x_1\| + (\|S^{(i)} - x_2\| - \|S^{(i)} - x_1\|)\right)^2. \quad (25)$$

Expanding the quadratic terms on both sides yields

$$\|S^{(i)}\|^2 - 2\langle S^{(i)}, x_2\rangle + \|x_2\|^2 = \|S^{(i)}\|^2 - 2\langle S^{(i)}, x_1\rangle + \|x_1\|^2 + 2\|S^{(i)} - x_1\|(\|S^{(i)} - x_2\| - \|S^{(i)} - x_1\|) + (\|S^{(i)} - x_2\| - \|S^{(i)} - x_1\|)^2. \quad (26)$$

Now substitute the definitions of $\Delta x$, $\rho_1^{(i)}$ and $\delta\rho^{(i)}$ into the
above expression and rearrange:

which can be rearranged as (24). □

We  can  assemble  a  vector  equation  by  applying  (24) 
to  the  measurements  from  every  satellite: 


H^AX  -{-  G^77d  4  R cd  =  Rad  +  Xd&bd, 


where 


-(S«)T  SpW  (pM  +  SpM) 
-(S^)T  6p&  (pW  +  8pW) 

l_-(S(m>)T  <5p(m)  (p(m)  +5p(m))j 


AX  A 

"  Ax  " 

cAti 

«  a 
?  Vd  ~ 

r^i 

_cAti2_ 

Gd  =  2diag {(p«  +  SpW)t  (p(2)  +  Sp( 2)), . . . 

,(pW+5p(m))}t 

Rcd  -  -2cAtl2rfp  -  2 ~  ^cAhVil  ~ 

{Pi2?  ~  2cAt12r,{t  i  =  1, 2, . . . , m 


Red  - 

i 

to  k) 

'*  s.SS-8 

,  .  Rid  = 

*  ^  -1 . Tr 

rH  r-4 

1  1  ** 

1 _ 

L-iJ 

R  ad  = 


2  p{p6pW  +  (dp«)2 
2  p|2)Jp<2>  +  (5p(2))2 


(29) 

(30) 

(31) 

(32) 

(33) 

(34) 

(35) 


l_2p(1m)^P(7n)  +  (<5p<m>)2] 

$$\chi_d = \|x_2\|^2 - \|x_1\|^2 - 2c\Delta t_1 c\Delta t_{12} - (c\Delta t_{12})^2. \quad (36)$$

Proposition 5.2. $\Delta X$, the solution to the vector equation
(29), satisfies the rank (m − 1) vector equation

$$E_dG_d^{-1}H_d\Delta X + E_d\eta_d + E_dG_d^{-1}R_{cd} = E_dG_d^{-1}R_{ad}, \quad (37)$$


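$$-2\langle S^{(i)}, \Delta x\rangle + \|x_2\|^2 - \|x_1\|^2 = 2(\rho_1^{(i)} - c\Delta t_1 - \eta_1^{(i)})(\delta\rho^{(i)} - c\Delta t_{12} - \eta_{12}^{(i)}) + (\delta\rho^{(i)} - c\Delta t_{12} - \eta_{12}^{(i)})^2. \quad (27)$$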


which is linear in the unknowns $\Delta x$, $c\Delta t_1$, and $c\Delta t_{12}$.

Proof. As for the standard GPS case, there exists a rank
(m − 1) left annihilator $E_d$ to the rank 1 vector $G_d^{-1}R_{bd}$.
Multiply (29) on the left by $E_dG_d^{-1}$ to obtain (37). □


Remark 5.3 (Approximate Linearity). Equation (37) is
already linear in the unknowns $\Delta x$, $c\Delta t_{12}$, and $c\Delta t_1$. If
we make some mild approximations, we can remove the
nonlinearities in the noise terms. The elements in (37)
typically are of the following sizes:

$\|S^{(i)}\| \approx 10^7$ m, $\rho_1^{(i)} \approx 10^7$ m, $c\Delta t_1 \approx 10^5$ m,

$c\Delta t_{12} \approx 10^5$ m, $\eta_1^{(i)} \approx 1$ m, $\eta_{12}^{(i)} \approx 1$ m,

$0$ m $\le \|\Delta x\| \le 10^7$ m, $0$ m $\le \|\delta\rho^{(i)}\| \le 10^7$ m.

If we choose a unitary annihilator (i.e., $E_dG_d^{-1}R_{bd} = 0$,
$E_dE_d^T = I_{m-1}$; such an annihilator always exists)
and ignore all terms in (37) that are smaller than 10 meters,
we have

$$E_dG_d^{-1}H_d\Delta X + E_d\eta_d = E_dG_d^{-1}R_{ad}, \quad (38)$$

which is linear in the unknown noises as well as in
$\Delta x$, $c\Delta t_{12}$, and $c\Delta t_1$. Note that this formulation
holds even for very long baselines. In contrast, the
linearization conventionally used for the measurement
equation introduces significant errors when the distance
between the two antennas is larger than 100 kilometers
[8]. It should be noted that terrestrial differential GPS over
long baselines is still subject to larger noises than those
associated with shorter baselines, due to noncommon mode
ionospheric errors.

6  NONLINEAR  CORRECTION 

In  the  course  of  the  derivation  of  the  linear  exact  solution, 
some  of  the  information  in  the  measurement  vector 
G_1Ra  is  not  used,  namely  the  part  of  G_1Ra  that 
is  orthogonal  to  the  projector  E.  This  information  can 
be  recovered  by  applying  Rj’(G“1)T,  the  orthogonal 
complement  of  E,  to  the  nonlinear  measurement  equation 
(17),  which  yields  the  following  scalar  nonlinear  equation: 

HX  4-  Grj  —  RG  -j-  R^x*  (39) 

where 

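\[ \bar R_a \triangleq R_b^T (G^{-1})^T G^{-1} R_a, \tag{40} \]

\[ \bar R_b \triangleq R_b^T (G^{-1})^T G^{-1} R_b, \tag{41} \]

\[ \bar H \triangleq R_b^T (G^{-1})^T G^{-1} H, \qquad \bar G \triangleq R_b^T (G^{-1})^T. \tag{42} \]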

Remark 6.1 (Independence of measurement noise).
Note that if the measurement noise $\eta$ is assumed to be
composed of independent and identically distributed
zero mean Gaussians (i.e., $\mathrm{E}[\eta] = 0$, $\mathrm{E}[\eta\eta^T] = \sigma^2 I$),
then $R_b^T(G^{-1})^T\eta$ is uncorrelated with the noise in the
measurement equation (20):

\[ \mathrm{E}\big[E\eta\,(R_b^T(G^{-1})^T\eta)^T\big] = \mathrm{E}\big[E\eta\eta^T G^{-1}R_b\big] \tag{43} \]

\[ = E(\sigma^2 I)G^{-1}R_b \tag{44} \]

\[ = \sigma^2 E G^{-1}R_b = 0. \tag{45} \]

This means in particular that $R_b^T(G^{-1})^T\eta$ is independent
of $X_0$, the estimate based on the measurement equation (20).


The  independence  of  X0  and  the  noise  in  (39) 
suggests  several  methods  for  adding  a  correction  to 
the  linear  estimate  based  on  the  nonlinear  part  of  the 
measurement  equation.  Some  possible  corrections  are 
a  linearization  of  the  nonlinear  equations,  a  minimum 
variance  linear  estimate  based  on  the  nonlinear  part,  a 
maximum  likelihood  estimate  based  on  the  nonlinear  part, 
and  a  conditional  mean  estimate  based  on  the  nonlinear 
part.  The  following  subsections  detail  each  of  these 
methods. 

6.1  LINEARIZATION  OF  NONLINEAR  PART 
When  linearized  about  X0,  equation  (39)  becomes 


HX  +  t)  —  Ra,  (46) 


where 


H4{H-Rb[xJ  -cAt0]},  (47) 

fj4Rf(G (48) 


Since the noise in (46) is independent of $X_0$, the updated estimate
that makes use of the linearized measurement equation (46)
is calculated using the standard least squares update


X  =  Xo  +  P0Ht(HP0Ht  +  (^Rft)-1)  • 

(R0  -  Rb  (G-1)tG-1R0),  (49) 


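where $R_0$ is the measurement corresponding to $X_0$:

\[ R_0 = \begin{bmatrix} (\|S_1 - x_0\|_2 + c\Delta t_0)^2 - \|S_1\|_2^2 \\ (\|S_2 - x_0\|_2 + c\Delta t_0)^2 - \|S_2\|_2^2 \\ \vdots \\ (\|S_m - x_0\|_2 + c\Delta t_0)^2 - \|S_m\|_2^2 \end{bmatrix}. \tag{50} \]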


6.2  MAXIMUM  LIKELIHOOD  ESTIMATE  USING 
NONLINEAR  PART 


The a priori probability density function (pdf) of $X$ given
by the linear estimate equations (22) and (23) is

\[ p(X) = \frac{1}{(2\pi)^2 |P_0|^{1/2}} \exp\left\{ -\tfrac{1}{2}(X - X_0)^T P_0^{-1} (X - X_0) \right\}. \tag{51} \]

Our objective is to maximize the joint probability density
function of $X$ and the measurement $\bar R_a$. By Bayes' rule,
this joint pdf can be expressed in terms of the conditional
pdf:

\[ p(X, \bar R_a) = p(\bar R_a | X)\, p(X). \]

The conditional pdf we require is determined by using the
nonlinear measurement equation (39):

\[ p(\bar R_a | X) = \frac{1}{\sqrt{2\pi}\,\bar\sigma} \exp\left\{ -\frac{1}{2\bar\sigma^2} \left( \bar R_a - \bar H X - \bar R_b X^T Q X \right)^2 \right\}, \tag{52} \]

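where

\[ \bar\sigma^2 \triangleq \mathrm{E}[\bar\eta\bar\eta^T] = (G^{-1}R_b)^T\, \mathrm{E}[\eta\eta^T]\, (G^{-1}R_b) = (G^{-1}R_b)^T (\sigma^2 I)(G^{-1}R_b) \tag{53} \]

and

\[ Q \triangleq \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}. \tag{54} \]

Then the product of (52) and (51),

\[ p(X, \bar R_a) = \frac{1}{\sqrt{2\pi}\,\bar\sigma\,(2\pi)^2 |P_0|^{1/2}} \exp\left\{ -\frac{1}{2\bar\sigma^2}\left(\bar R_a - \bar H X - \bar R_b X^T Q X\right)^2 - \tfrac{1}{2}(X - X_0)^T P_0^{-1}(X - X_0) \right\}, \tag{55} \]

is the joint pdf that we must maximize with respect to
$X$. Clearly, maximizing $p(X, \bar R_a)$ above is equivalent to
minimizing the function

\[ f(X) \triangleq \frac{1}{2\bar\sigma^2}\left(\bar R_a - \bar H X - \bar R_b X^T Q X\right)^2 + \tfrac{1}{2}(X - X_0)^T P_0^{-1}(X - X_0). \tag{56} \]

The maximum likelihood estimate can thus be
obtained by minimizing the function $f(X)$ expressed in
equation (56). Note that $f$ is not necessarily convex in
$X$, so this minimization problem may not be an easy
one. Since $f$ is a smooth function, it is convex if and
only if its Hessian is positive semidefinite everywhere. The
relevant quantities to determine convexity and solve for the
optimum via a Newton-Raphson algorithm are

\[ \frac{\partial}{\partial X} f(X) = -\frac{1}{\bar\sigma^2}\left(\bar R_a - \bar H X - \bar R_b X^T Q X\right)\left(\bar H + 2\bar R_b X^T Q\right) + (X - X_0)^T P_0^{-1}, \tag{57} \]

\[ \frac{\partial^2}{\partial X^2} f(X) = -\frac{2}{\bar\sigma^2}\bar R_b\left(\bar R_a - \bar H X - \bar R_b X^T Q X\right) Q + \frac{1}{\bar\sigma^2}\left(\bar H + 2\bar R_b X^T Q\right)^T\left(\bar H + 2\bar R_b X^T Q\right) + P_0^{-1}. \tag{58} \]

6.3 CONDITIONAL MEAN ESTIMATE USING
NONLINEAR PART

In the previous subsection, the joint probability density
function $p(X, \bar R_a)$ was calculated. The density function
$p(\bar R_a)$ is given by

\[ p(\bar R_a) = \int_{-\infty}^{\infty} p(X, \bar R_a)\, dX, \tag{59} \]

and by Bayes' rule,

\[ p(X | \bar R_a) = \frac{p(X, \bar R_a)}{p(\bar R_a)}. \tag{60} \]

Then the conditional mean is

\[ \mathrm{E}[X | \bar R_a] = \int_{-\infty}^{\infty} X\, p(X | \bar R_a)\, dX. \tag{61} \]

This conditional mean estimate is the one we really
want, although the above integrals are difficult to compute.
Until these integrals are solved, the maximum likelihood
solution from the last section remains the best viable
alternative.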
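The Newton-Raphson minimization of $f(X)$ in (56) can be sketched as follows, using the gradient (57) and Hessian (58). This is an illustrative sketch under stated assumptions, not the authors' implementation: every numerical value (`H`, `RB`, `RA`, `SIGMA2`, `P0_INV`, `X0`) is a made-up toy quantity, the helper names are hypothetical, and no step-size control is included, so convergence is only to be expected when the Hessian is positive definite near the starting point.

```python
Q = [1.0, 1.0, 1.0, -1.0]  # diagonal of Q in (54)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            k = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= k * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def residual(x, h, rb, ra):
    """Scalar R_a - H X - R_b X^T Q X."""
    quad = sum(q * xi * xi for q, xi in zip(Q, x))
    return ra - sum(hi * xi for hi, xi in zip(h, x)) - rb * quad


def slope(x, h, rb):
    """Row vector H + 2 R_b X^T Q."""
    return [hi + 2.0 * rb * q * xi for hi, q, xi in zip(h, Q, x)]


def gradient(x, h, rb, ra, sigma2, p0_inv, x0):
    """Gradient of f, eq. (57), written as a column."""
    r, g = residual(x, h, rb, ra), slope(x, h, rb)
    d = [xi - x0i for xi, x0i in zip(x, x0)]
    prior = [sum(p0_inv[i][j] * d[j] for j in range(4)) for i in range(4)]
    return [-r * g[i] / sigma2 + prior[i] for i in range(4)]


def hessian(x, h, rb, ra, sigma2, p0_inv):
    """Hessian of f, eq. (58)."""
    r, g = residual(x, h, rb, ra), slope(x, h, rb)
    return [
        [
            g[i] * g[j] / sigma2
            + p0_inv[i][j]
            - (2.0 * rb * r * Q[i] / sigma2 if i == j else 0.0)
            for j in range(4)
        ]
        for i in range(4)
    ]


def newton(x0, h, rb, ra, sigma2, p0_inv, iters=20):
    """Plain Newton-Raphson iteration started at the linear estimate X0."""
    x = list(x0)
    for _ in range(iters):
        step = solve(hessian(x, h, rb, ra, sigma2, p0_inv),
                     gradient(x, h, rb, ra, sigma2, p0_inv, x0))
        x = [xi - si for xi, si in zip(x, step)]
    return x


# Toy problem (all values hypothetical).
H = [1.0, 2.0, -1.0, 0.5]
RB, SIGMA2 = 0.01, 1.0
P0_INV = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
X0 = [1.0, 0.0, -1.0, 2.0]
X_TRUE = [1.2, 0.1, -0.9, 2.1]
RA = sum(hi * xi for hi, xi in zip(H, X_TRUE)) + RB * sum(
    q * xi * xi for q, xi in zip(Q, X_TRUE))
X_HAT = newton(X0, H, RB, RA, SIGMA2, P0_INV)
```

Because $f$ need not be convex, a practical implementation would likely add a line search or check the Hessian for positive definiteness before each step; that safeguard is omitted here for brevity.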


6.4  MINIMUM  VARIANCE  ESTIMATE  USING 
NONLINEAR  PART 

Let us describe the random variable $X$ as

\[ X = X_0 + \Delta X, \]

where $X_0$ is the a priori estimate generated using the linear
part of the measurement equations and $\Delta X$ is a zero-mean
Gaussian random variable with covariance $P_0$. Then the
nonlinear measurement equation (39) can be expressed as

\[ \bar H (X_0 + \Delta X) + \bar R_b (X_0 + \Delta X)^T Q (X_0 + \Delta X) + \bar G \eta = \bar R_a, \tag{62} \]

or upon rearrangement as

\[ H_1 \Delta X + \bar R_b (\Delta X)^T Q (\Delta X) + \bar G \eta = R_1, \tag{63} \]

where

\[ H_1 = \bar H + 2 \bar R_b X_0^T Q, \tag{64} \]

\[ R_1 = \bar R_a - \bar H X_0 - \bar R_b X_0^T Q X_0. \tag{65} \]

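Given this measurement equation and the a priori
distribution of $\Delta X$, the linear minimum variance estimate
of $\Delta X$ is

\[ \widehat{\Delta X} = \mathrm{E}[(\Delta X) R_1^T]\, \mathrm{E}[R_1 R_1^T]^{-1} R_1, \tag{66} \]

as demonstrated in [9]. Since we have already shown that
$\Delta X$ is independent of the measurement noise $\bar G \eta$, the
expectations in the above equation are readily calculated:

\[ \mathrm{E}[(\Delta X) R_1^T] = P_0 H_1^T, \tag{67} \]

\[ \begin{aligned} \mathrm{E}[R_1 R_1^T] = {} & \bar R_b^2 \big\{ 3(P_{11}^2 + P_{22}^2 + P_{33}^2 + P_{44}^2) + 4P_{12}^2 + 2P_{11}P_{22} + 4P_{13}^2 + 2P_{11}P_{33} + 4P_{23}^2 + 2P_{22}P_{33} \\ & - 4P_{14}^2 - 2P_{11}P_{44} - 4P_{24}^2 - 2P_{22}P_{44} - 4P_{34}^2 - 2P_{33}P_{44} \big\} + H_1 P_0 H_1^T + \bar G V \bar G^T, \end{aligned} \tag{68} \]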


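where $P_0$ has been partitioned as

\[ P_0 = \begin{bmatrix} P_{11} & P_{12} & P_{13} & P_{14} \\ P_{12} & P_{22} & P_{23} & P_{24} \\ P_{13} & P_{23} & P_{33} & P_{34} \\ P_{14} & P_{24} & P_{34} & P_{44} \end{bmatrix}. \tag{69} \]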


7  EXPERIMENTAL  RESULTS 


The techniques developed in this paper were tested via
Monte Carlo simulations. These simulations each used the
same real GPS satellite ephemeris data collected beginning
at 19:22:33.5 PST on Friday, October 13th 2000 at
(-2.5192459e+06 m, -4.6431270e+06 m, 3.5626325e+06
m) in GPS earth centered earth fixed (ECEF) coordinates.
Each simulation located the GPS receiver at a random
position, centered at (-2.5192459e+06 m, -4.6431270e+06
m, 3.5626325e+06 m) in GPS ECEF coordinates, with
standard deviation 1000 km. The noiseless pseudorange
measurements were calculated, then corrupted with zero
mean, 15 m standard deviation Gaussian noise. Thus,
a sequence of artificial measurements where the true
user position was known was available for testing our
methods. A series of 50 simulations was performed, each
simulation containing 1191 data points. The results of these
simulations are displayed in Table 1.

The Monte Carlo simulations we used were fairly
low fidelity, as they took no account of ionospheric
or tropospheric noise. To check the validity of our
results, we also ran a simulation on an Interstate
Electronics Corporation model 2400 GPS satellite
constellation simulator, collecting measurements with
an Ashtech model Z-12 GPS receiver. This simulation
followed the trajectory of an aircraft, initially located at
(962850.28547 m, -5200816.32182 m, 3563520.00371 m)
in GPS ECEF coordinates, starting at 16:30:00 PST
on Monday, January 22, 2001. The corresponding
measurement sequence, which consisted of 1680
measurement epochs, was thus corrupted by true receiver
noise, as well as a good approximation of the tropospheric
and ionospheric noises. As with the Gaussian Monte
Carlo simulations, the true position of the antenna was
known, allowing precise calculation of the estimation
errors. The results of several solution techniques applied
to this simulation appear in Table 2.

Table 1: Comparison of errors in Monte Carlo simulations

GPS solution method        mean error    error std. dev.
IDS                        32.7163 m     1.3870 m
IDSBIS 2 steps             32.7143 m     1.3869 m
linear data only           581.0040 m    767.4165 m
project, then linearize    49.9955 m     69.8273 m
project, then min. var.    35.9738 m     20.7965 m
project, then max. lik.    32.7148 m     1.3866 m

Table 2: Comparison of errors from GPS satellite
constellation simulator simulation

GPS solution method        mean error    error std. dev.
IDS                        34.3087 m     0.9417 m
IDSBIS 2 steps             34.3196 m     0.9271 m
linear data only           50.4577 m     25.2115 m
project, then linearize    32.5180 m     2.0597 m
project, then min. var.    34.3197 m     0.9271 m
project, then max. lik.    34.3195 m     0.9271 m

The linear method alone is not as accurate as
other methods, because not all of the information in the
measurements has been used. When the methods of the
last section are used, the results are of comparable accuracy
to those of Biton et al. In fact, the magnitudes of the errors
of the maximum likelihood method differ from those of the
IDS only by millimeters.

The IDS scheme may be implemented in a recursive
fashion, which is called the "IDS-Based Iterative Solution"
(IDSBIS) in [6]. To our surprise, iterating the IDS failed to
notably improve the accuracy of the solution in our Monte
Carlo simulations, in contrast to the results reported in [6].
We speculate that this was due to the true Gaussian nature
of the measurement errors in these simulations, whereas
the real GPS measurements used by Biton et al. were
corrupted by non-Gaussian noises.


8  CONCLUSIONS 

This paper presents a direct method for solving the GPS
equations. For noiseless pseudomeasurements, the user
position can be determined by solving a set of linear
equations, without making any approximations. If the
pseudomeasurements are noisy, the equations are still
linear in the unknown position and clock bias, and the
nonlinearities in the noise terms are small enough to
be safely ignored. The solutions are applicable to the
differential GPS problem, as well as the single user GPS
problem.

The conversion to a linear problem is wasteful in an
information sense. That is, some of the measurement data
is not present in the exact linear solution. The position
estimate thus has a larger associated error covariance than
that associated with an ILS method that has converged
successfully. Of course, one cannot tell whether an
ILS solution has converged to the correct answer, so a
tradeoff has been made between certainty of convergence
and precision of the estimate. We have presented
several methods for improving the linear estimate by using
the information not present in the linear measurement
equations. These techniques yield results on par with the
ad hoc procedure developed by Biton et al., while having
a more sound theoretical basis and better understood error
bounds and convergence guarantees.
A residual-based scheme is presented for solving the radar track-to-track association problem using bearings-only
measurements. To accomplish track association between two stations, the residuals of a bank of nonlinear
filters called modified gain extended Kalman filters are analyzed. Once tracks have been associated between two
stations, tracks from additional stations may be associated with tracks from the first two stations by checking
algebraic parity equations. Traditional track association methods rely on the local stations' estimated target
positions and error variances. These local estimates may be quite inaccurate or even divergent when using
bearings-only measurements. Our method bypasses this difficulty because our filters use raw data from multiple stations.
An example demonstrates that our methods yield results superior to those of standard methods.


I. Introduction

SUPPOSE that several spatially distributed radar installations
are each tracking several targets. Associating a given target to
its track at each of the radar stations is an important issue, which
the radar literature refers to as the track-to-track association problem.
Suppose further that the stations use passive sensors that only
measure bearings to the target, without measuring range. In this paper,
we outline a strategy for solving this association problem by
analyzing measurement residuals.

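Bearings-only observation functions fall into two special classes
of nonlinear functions, called modifiable and approximately modifiable
nonlinearities, which are defined as follows:

Definition 1. A time-varying function $f: \mathbb{R}^n \to \mathbb{R}^q$ is called
modifiable if there exists an operator $A: \mathbb{R}^q \times \mathbb{R}^n \to \mathbb{R}^{q \times n}$ such that for
any $x, \hat x \in \mathbb{R}^n$,

\[ f(x) - f(\hat x) = A[f(x), \hat x](x - \hat x) \tag{1} \]

Definition 2. A time-varying function $f: \mathbb{R}^n \to \mathbb{R}^q$ is called
approximately modifiable if there exists a region $D \subset \mathbb{R}^n$ and operators
$A: \mathbb{R}^q \times \mathbb{R}^n \to \mathbb{R}^{q \times n}$ and $\mathcal{E}: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^{q \times n}$ such that for any
$x, \hat x \in D$,

\[ f(x) - f(\hat x) = [A(f(x), \hat x) + \mathcal{E}(\hat x, x - \hat x)](x - \hat x) \tag{2} \]

where $\lim_{\|x - \hat x\| \to 0} \|\mathcal{E}(\hat x, x - \hat x)\| / \|A(f(x), \hat x)\| = 0$.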

Song and Speyer's modified gain extended Kalman filter
(MGEKF)1 is a globally convergent, unbiased, nonlinear observer
for systems whose measurement functions are modifiable or
approximately modifiable. In this paper, the observers we design for
bearings-only track association are MGEKFs.

An early attempt at solving the track-to-track association problem
was made by Singer and Kanyuck.2 In their paper, they incorrectly
assumed that estimation errors local to each station were uncorrelated.
Bar-Shalom,3 Bar-Shalom and Fortmann,4 and Bar-Shalom
and Campo5 later corrected this error by accounting for the correlation
between the local estimation errors due to the common process
noise of the target. Later researchers have integrated the problem
of track association directly into the process of separating the
measurements corresponding to actual targets from clutter.6,7 In all of
these references, it is assumed that both range and bearings were
measured. In some of these references, the possibility of using a
MGEKF to handle the situation of bearings-only measurements is
mentioned, but none discusses the details of such an implementation,
in particular the problems associated with the asymmetry
of single station estimation errors. Estimates based on bearings-only
measurements from a single station are especially uncertain along
the line between the target and the receiver. This uncertainty is
reduced when measurements from physically separated stations are
used. Our method attempts to take advantage of this phenomenon by
using estimates constructed from several stations' measurements.

The paper is organized as follows. We show in Sec. II that
bearings-only measurement functions are modifiable. (Prior results
only showed that they were approximately modifiable.1) We then
demonstrate in Sec. III that incorrect associations between two
radar stations can be interpreted as sensor faults, so that a bank
of modified-gain fault detection filters can be used to determine the
track associations. Section IV contains the main result, an algorithm
for solving the bearings-only track association problem. The
application of this algorithm to an example in Sec. V compares our
approach to a conventional track association method. Section VI
concludes the paper.

In the sequel, inertial Cartesian coordinates describe the motion
of each target in three dimensions via the state vector

\[ x^t \triangleq [X^t \; Y^t \; Z^t \; \dot X^t \; \dot Y^t \; \dot Z^t \; \ddot X^t \; \ddot Y^t \; \ddot Z^t]^T \tag{3} \]

and the dynamics of each target are assumed to be of the form

\[ x^t(k+1) = A(k)x^t(k) + B(k)w^t(k) \tag{4} \]

Note that we include an acceleration state to model maneuvering
target dynamics.


II. Modifiability of Bearings-Only Measurements

Song and Speyer1 showed that the azimuth angle $az_s^t \in [-\pi/2, \pi/2)$
and the elevation angle $el_s^t \in [-\pi/2, \pi/2)$ from station
$s$ to target $t$, as shown in Fig. 1, are modifiable and approximately
modifiable, respectively. The region $D$ in which the elevation angle
was approximately modifiable excluded an ellipsoidal region near
the sensor, making their algorithms difficult to implement for situations
where the angular sensor gets close to the target, for example,
in the terminal guidance of a missile. We improve this situation
somewhat by introducing the new angle $\psi_s^t \in [-\pi/2, \pi/2)$ and
describing the position of the target in terms of $\psi_s^t$ and $\phi_s^t \triangleq az_s^t$.
Note that $\psi_s^t$ can be calculated from $az_s^t$ and $el_s^t$ via the equation

\[ \psi_s^t = \tan^{-1}\left(\frac{Z_s^t}{X_s^t}\right) = \tan^{-1}\left(\frac{\tan el_s^t}{\cos az_s^t}\right) \tag{5} \]

This section is devoted to proving that the measurement function
for $\psi_s^t$ is modifiable.
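Equation (5) can be checked numerically. The sketch below is illustrative only: the function names and the sample relative position are hypothetical, and it assumes the target lies in the half-space $X > 0$ relative to the station, so that all three angles fall on the principal branch of the arctangent.

```python
import math


def bearings(x, y, z):
    """Azimuth, elevation, and the angle psi for a relative position with x > 0."""
    az = math.atan(y / x)
    el = math.atan(z / math.hypot(x, y))
    psi = math.atan(z / x)
    return az, el, psi


def psi_from_az_el(az, el):
    """Right-hand side of eq. (5): psi computed from azimuth and elevation."""
    return math.atan(math.tan(el) / math.cos(az))


# Hypothetical target position relative to the station, in meters.
AZ, EL, PSI = bearings(3000.0, 4000.0, 1200.0)
```

For positions with $X < 0$ the principal-branch arctangents no longer identify the quadrant, so a practical implementation would need an explicit angle convention there.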

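Let $\hat x^t$ be an estimate of $x^t$ and assume that the position of the
measurement station in inertial space, $x_s = [X_s \; Y_s \; Z_s]^T$, is known.
Then $X_s^t$, $Y_s^t$, $Z_s^t$, $\hat X_s^t$, $\hat Y_s^t$, and $\hat Z_s^t$ can be computed by taking the
difference between elements of $x^t$, $\hat x^t$, and $x_s$.

Fig. 1 Angles for target bearings (radar station s and target t).

Suppose that station $s$ measures the bearings of target $t$ with the
measurement vector $z_s^t$. Define $h_s(x^t)$ by

\[ z_s^t = h_s(x^t) \triangleq \begin{bmatrix} \phi_s^t \\ \psi_s^t \end{bmatrix} = \begin{bmatrix} \tan^{-1}(Y_s^t / X_s^t) \\ \tan^{-1}(Z_s^t / X_s^t) \end{bmatrix} \tag{6} \]

The measurement residual corresponding to $h_s(x^t)$ is then

\[ h_s(x^t) - h_s(\hat x^t) = \begin{bmatrix} \tan^{-1}(Y_s^t / X_s^t) - \tan^{-1}(\hat Y_s^t / \hat X_s^t) \\ \tan^{-1}(Z_s^t / X_s^t) - \tan^{-1}(\hat Z_s^t / \hat X_s^t) \end{bmatrix} \triangleq \begin{bmatrix} \tan^{-1}\alpha \\ \tan^{-1}\beta \end{bmatrix} \tag{7} \]

Applying the trigonometric identity

\[ \tan^{-1}(a) - \tan^{-1}(b) = \tan^{-1}[(a - b)/(1 + ab)] \]

we obtain

\[ \begin{bmatrix} \tan^{-1}\alpha \\ \tan^{-1}\beta \end{bmatrix} = \begin{bmatrix} \tan^{-1}\left( \dfrac{(Y_s^t / X_s^t) - (\hat Y_s^t / \hat X_s^t)}{1 + (Y_s^t / X_s^t)(\hat Y_s^t / \hat X_s^t)} \right) \\ \tan^{-1}\left( \dfrac{(Z_s^t / X_s^t) - (\hat Z_s^t / \hat X_s^t)}{1 + (Z_s^t / X_s^t)(\hat Z_s^t / \hat X_s^t)} \right) \end{bmatrix} = \begin{bmatrix} \tan^{-1}\left( \dfrac{Y_s^t \hat X_s^t - \hat Y_s^t X_s^t}{X_s^t \hat X_s^t + Y_s^t \hat Y_s^t} \right) \\ \tan^{-1}\left( \dfrac{Z_s^t \hat X_s^t - \hat Z_s^t X_s^t}{X_s^t \hat X_s^t + Z_s^t \hat Z_s^t} \right) \end{bmatrix} \]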

M(4)  4 

r  sin(*o 


0  0  0  0  0  0  0 

— eos('Pj)  0  0  0  0  0  0 


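Let $d_1 \triangleq \sqrt{(X_s^t)^2 + (Y_s^t)^2}$, $D_1 \triangleq d_1 / [X_s^t \hat X_s^t + Y_s^t \hat Y_s^t]$,
$d_2 \triangleq \sqrt{(X_s^t)^2 + (Z_s^t)^2}$, and $D_2 \triangleq d_2 / [X_s^t \hat X_s^t + Z_s^t \hat Z_s^t]$. Note also that
$\sin(\phi_s^t) = Y_s^t / d_1$, $\cos(\phi_s^t) = X_s^t / d_1$, $\sin(\psi_s^t) = Z_s^t / d_2$, and
$\cos(\psi_s^t) = X_s^t / d_2$. Therefore, we can express $D_1$ and $D_2$ as functions
of the estimates and measured angles:

\[ D_1 = D_1(z_s^t, \hat x^t) = 1 / [\cos(\phi_s^t) \hat X_s^t + \sin(\phi_s^t) \hat Y_s^t] \]

\[ D_2 = D_2(z_s^t, \hat x^t) = 1 / [\cos(\psi_s^t) \hat X_s^t + \sin(\psi_s^t) \hat Z_s^t] \]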

If  we  express  the  trigonometric  fimctions  in  H(z!s),  Du  and  D2 
in  terms  of  X',  F/,  Z*,  X*,  FJ,  and  Z',  we  can  write  Eq,  (9)  as  a 
function  of  zt  andxf: 


j(z’s,x OJ  L  0 


Hfe) [*'-**]  <H> 


Finally,  we  can  rewrite  Eq,  ( 1 1 )  as 

r-iin/.fe,?)  o  "  ”0,(2', ic')  0 

L_1J  L  0  L  0  d2(Z’„x’)_ 

x  h{z\)  [x'  -x‘  -x'+x,]  (12) 

and  combine  it  with  Eq .  (7)  to  obtain  hs  (xr)  in  modifiable  form, 


l>i(z*,xf)tan  !a(z*,sr) 


P2(z;,x‘)tan  'fffe.x*) 


xH(z')[x'  —  j?] 

where  we  have  made  use  of  the  identity 


H{z\)[xs  -x']  =  ® 


Thus, we have replaced the elevation angle $el_s^t$, from which Song and
Speyer1 produced an approximately modifiable function, with a new
angle $\psi_s^t$. Like the azimuth angle $\phi_s^t$, angle $\psi_s^t$ leads to modifiable
measurement functions.
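The key property of a modifiable function, that the residual equals a gain built only from the measured angle and the estimate times the state error, can be checked numerically for the azimuth channel. The sketch below is an illustration under stated assumptions rather than the paper's exact expression: the function names are hypothetical, the sign convention of the gain row is chosen here so that the identity holds as written, and both the true and estimated relative positions are assumed to have positive first coordinate with the arctangent difference identity applicable.

```python
import math


def azimuth(x, y):
    """Azimuth of a relative position with x > 0."""
    return math.atan(y / x)


def modifiable_gain(phi, x_hat, y_hat):
    """Row g such that azimuth(x, y) - azimuth(x_hat, y_hat) = g . [x - x_hat, y - y_hat].

    Only the measured angle phi and the estimate (x_hat, y_hat) are used;
    the true position never enters.
    """
    big_d1 = 1.0 / (math.cos(phi) * x_hat + math.sin(phi) * y_hat)
    alpha = big_d1 * (math.sin(phi) * x_hat - math.cos(phi) * y_hat)
    scale = math.atan(alpha) / alpha if alpha != 0.0 else 1.0
    return [-scale * big_d1 * math.sin(phi), scale * big_d1 * math.cos(phi)]


# Hypothetical true and estimated relative positions, in meters.
X_TRUE, Y_TRUE = 1000.0, 400.0
X_EST, Y_EST = 900.0, 650.0
PHI = azimuth(X_TRUE, Y_TRUE)
GAIN = modifiable_gain(PHI, X_EST, Y_EST)
LHS = azimuth(X_TRUE, Y_TRUE) - azimuth(X_EST, Y_EST)
RHS = GAIN[0] * (X_TRUE - X_EST) + GAIN[1] * (Y_TRUE - Y_EST)
```

The equality is exact rather than a first-order approximation, which is what distinguishes the modified gain from an ordinary extended Kalman filter Jacobian.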

III. Converting Incorrect Associations
into Sensor Faults

Suppose that station $s$ can view several targets, indexed by $i$, and
measures the bearings of each target. Then each of these measurements
$z_s^i$ is generated by $h_s(x^i)$, as in Eq. (6). Now suppose that
another station, using its local observations, generates a state estimate
of one of the targets that station $s$ views. This estimate $\hat x^j$
corresponds to $x^j$, the true state of the $j$th target at station $s$, but
neither station knows the value of index $j$. Our goal is to determine
which of the tracks at station $s$ is the $j$th one, using only $\{z_s^i\}$, the
measurements local to station $s$, and $\hat x^j$, the other station's state
estimate of one of the targets.

To this end, let us form the following error residual between the
estimate $\hat x^j$ and the measurement $z_s^i$, making use of the result from
the preceding section:

z's  -  h,(x>)  =  MX)  -  hs(x‘)  -  G(z‘s,xJ)(x‘  - xJ )  (14) 

where  from  Eq.  (13) 

G(z*„F)  = 

Pi  (4 ,  xJ)  tan"1  a  ,  xJ) 

*(zj,F)  ° 

Q  P2(z;,^)tan-|^(zfi,f>) 


(15) 
By introducing a zero term into the measurement residual, we can
rephrase it as

\[ z_s^i - h_s(\hat x^j) = h_s(x^i) - h_s(\hat x^j) \tag{16} \]

\[ z_s^i - h_s(\hat x^j) = G(z_s^i, \hat x^j)(x^i - \hat x^j) \tag{17} \]

\[ z_s^i - h_s(\hat x^j) = G(z_s^i, \hat x^j)(x^i - x^j + x^j - \hat x^j) \tag{18} \]

\[ z_s^i - h_s(\hat x^j) = G(z_s^i, \hat x^j)(x^j - \hat x^j) + G(z_s^i, \hat x^j)(x^i - x^j) \tag{19} \]

\[ z_s^i - h_s(\hat x^j) = G(z_s^i, \hat x^j)(x^j - \hat x^j) + \mu^{ij} \tag{20} \]

where $\mu^{ij} \triangleq G(z_s^i, \hat x^j)(x^i - x^j)$ represents the difference between
$x^i$ and $x^j$ as a sensor fault. If $i = j$, we have correctly guessed the
association between measurement and estimate, and there is no fault
($\mu^{ij} = 0$). If $i \neq j$, then $\mu^{ij} \neq 0$, playing the role of a sensor fault in
the residual.

IV.  Algorithm  for  Track  Association 
from  Bearings-Only  Measurements  via  Fault 
Detection  Filters 

Suppose that there are $S$ radar stations, with known inertial coordinates,
that make bearings-only measurements in three-space of $T$
different targets. We assume that all measurements at each station
have been grouped as tracks of each target visible at that station
using conventional means.4,8,9 In this section, we propose an algorithm
for associating the tracks at all stations to their corresponding
targets.

Assume that each measurement station $s$ is located at known
inertial coordinates $(X_s, Y_s, Z_s)$. Let $\hat x^{1i}$ denote a fault detection
filter's estimate of the target corresponding to the $i$th track at the
first station. The bearings-only measurement function for station
$s$ of the same target is thus


From the results of the preceding section, the error residual of track
$j$ at any station $s$, generated by target $i$ at station 1, is given by

\[ z_s^j - h_s(\hat x^{1i}) \approx G(z_s^j, \hat x^{1i})(x^{1i} - \hat x^{1i}) + \mu^{ij} + v^j \tag{21} \]

where $G(z_s^j, \hat x^{1i})$ is given by Eq. (15) and the sensor noise is
$v^j \sim \mathcal{N}(0, V^j)$.

The approximate structure of Eq. (21) is due to the replacement of the
measurement function in $G(\cdot, \cdot)$ with the actual measurement (see
Song and Speyer1). Note that, by default, $\mu^{ii} = 0$, $\forall i = 1, \ldots, T$.

The following algorithm, illustrated in Fig. 2, associates tracks
between stations.

Algorithm (track association):

1) Let $i = 1$.

2) Run a bank of $T$ detection filters that operate on data from
stations 1 and 2, where the $j$th filter attempts to detect $\mu^{ij}$. Each
filter is constructed using the dynamic detection filter procedure
given next. All but one of these detection filters should register a
fault. The track corresponding to the filter that detected no fault is
associated with $z_1^i$. Without loss of generality, label this track $z_2^i$.

3) For each track $z_s^l$, $s = 3, \ldots, S$, $l = 1, \ldots, T$, perform the
algebraic parity test given subsequently. If the result of the parity test
is zero, then $z_s^l$ is associated with $z_1^i$ and $z_2^i$.

4) If $i < T$, increment $i$ by 1 and go to step 2. If $i = T$, we have
completed the track association procedure.

Note that estimates obtained in step 2 are used in step 3. Therefore,
stations 1 and 2 should be chosen to maximize observability of the
targets.
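The control flow of the four steps above can be sketched as follows. This is a schematic illustration, not the paper's implementation: the detection filters and parity tests are replaced by precomputed scalar statistics (`fault_12` and `parity`, both hypothetical inputs), "no fault" is approximated by picking the smallest statistic, and "zero parity" by a comparison against a hypothetical `threshold`.

```python
def associate_tracks(fault_12, parity, threshold):
    """Associate tracks across stations from precomputed test statistics.

    fault_12[i][j]: fault statistic of the detection filter that pairs
        track i at station 1 with track j at station 2 (small = no fault).
    parity[k][i][l]: parity statistic of track l at station k + 3 against
        the estimate built from track i and its station-2 match.
    Returns one dict per station-1 track, mapping station number to the
    index of the associated track at that station.
    """
    associations = []
    for i, row in enumerate(fault_12):                     # steps 1 and 4
        j = min(range(len(row)), key=row.__getitem__)      # step 2
        match = {1: i, 2: j}
        for station, stats in enumerate(parity, start=3):  # step 3
            l = min(range(len(stats[i])), key=stats[i].__getitem__)
            if stats[i][l] < threshold:
                match[station] = l
        associations.append(match)
    return associations


# Toy statistics for T = 2 targets and S = 3 stations (all hypothetical).
FAULT_12 = [[0.1, 5.0],
            [4.0, 0.2]]
PARITY = [[[3.0, 0.05],
           [0.1, 6.0]]]
RESULT = associate_tracks(FAULT_12, PARITY, threshold=1.0)
```

With these toy statistics, track 0 at station 1 is paired with track 0 at station 2 and track 1 at station 3, and track 1 at station 1 with track 1 at station 2 and track 0 at station 3.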

Dynamic  Detection  Filter 

For any estimator of $x^{1i}$, the estimation residual determined by
the measurements $z_1^i$ and $z_2^j$ will not converge to values near zero
unless $z_1^i$ and $z_2^j$ correspond to the same target. One such estimator is
the MGEKF1 given as

xli(k  + 1)  =  A(k)xy(k)  (22) 

i  (23) 

z{(k)-h2[ium  j 

xu(k)  =  x11  (k)  +  KIJ  (kyj  (Jt)  (24) 

M»(k  +  1)  =  A(k)P<j(k)AT(k)  +  Q(k )  (25) 




Fig. 5 Filtered MGEKF residual for matching tracks. Axis label: residual (radians).

Fig. 8 Parity test residual for mismatched tracks. Axis label: residual (radians).


covariance 10 \ and measurement noise with covariance 1). These
estimates (Figs. 5 and 6) clearly show that the mean corresponding
to a mismatch looks nothing like that of the matched case.

After the tracks had been associated between the first two stations,
algebraic parity tests attempted to associate the targets observed
by the third station relative to those observed by the first
and second stations. Two plots of residuals generated by the algebraic
parity tests appear in Figs. 7 and 8. Again, the residuals for
the mismatch are much larger than those corresponding to a correct
association.

For purposes of comparison, Fig. 9 plots the error statistic developed
by Bar-Shalom3 and Bar-Shalom and Fortmann4 for both
a correct and an incorrect track association (using the same data
sequences that were used by the filters in Figs. 3 and 4). Note
that the chi-squared error statistic does not change much between
the matched and mismatched cases. We also noticed that there
Table 1 Radar station positions

x Position, m    y Position, m    z Position, m

-400

100
Fig. 2 Track association procedure. Block labels: tracks from station s; bank of algebraic parity tests; track from station 2 that best matches track i, station 1; track from station s that best matches track i, station 1.

K‘Uk)  =  MHWlull)[htU(k)M'Uk)hTllw  +  V"  (*)]"'  (27) 

-r-  ,  ,  i  G{z\(k),xu(kj) 

G[ti(k),zi2(k),xh(k)]=  )  :  (28) 

_G(zi(k),x'-(k))_ 

P'Hk)  =  { /  -  KiJ (k)G[^(k),zi(kS,xu (k)] |M'-' (k) 
x  {/ - K>(]c)G[^(k)t zi(k),x'‘(k)]}T 

+KiJ(k)(ViJ)-l(k)(KIJ)T(k)  (29) 

where 

V^(&)  =  diag{V%  VJ]  (30) 

The  weighted  innovations  process  of  the  MGEKF, 

i/j(k)  =  + v«<»]  W>  (3i) 

should be close to a zero-mean, unit variance white noise sequence
only if $z_1^i$ and $z_2^j$ correspond to the same target.
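Whether a weighted innovations sequence is close to zero-mean, unit variance, and white can be screened with simple sample statistics. The sketch below is a crude illustration, not the test used in the paper: the function names, the tolerance `tol`, and the synthetic sequences are all assumptions, and only the lag-one autocorrelation is examined.

```python
import random


def whiteness_stats(seq):
    """Sample mean, sample variance, and lag-1 autocorrelation of a scalar sequence."""
    n = len(seq)
    mean = sum(seq) / n
    var = sum((v - mean) ** 2 for v in seq) / (n - 1)
    lag1 = sum((seq[k] - mean) * (seq[k + 1] - mean)
               for k in range(n - 1)) / ((n - 1) * var)
    return mean, var, lag1


def looks_white(seq, tol=0.1):
    """True if the sequence is roughly zero-mean, unit variance, and uncorrelated at lag 1."""
    mean, var, lag1 = whiteness_stats(seq)
    return abs(mean) < tol and abs(var - 1.0) < tol and abs(lag1) < tol


# Synthetic stand-ins: a matched case and a biased (mismatched) case.
_rng = random.Random(0)
MATCHED = [_rng.gauss(0.0, 1.0) for _ in range(5000)]
MISMATCHED = [v + 0.5 for v in MATCHED]
```

A constant bias is only one way a mismatch can show up; a fuller check would examine more autocorrelation lags or apply a formal chi-squared test.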

Algebraic  Parity  Test 

This test determines if $z_s^l$, $S \geq s > 2$, $T \geq l \geq 1$, is associated with
$z_1^i$ and $z_2^i$, where $z_1^i$ and $z_2^i$ are already known to be associated with
each other. Suppose that $\hat x^{1i}$ is the state estimate generated by $z_1^i$ and
$z_2^i$. Then, if $z_s^l$ is associated with the tracks $z_1^i$ and $z_2^i$,

v{k)\l  4  [hxumMiJ(k)hTxl,m  +  Vi^k)f{zfs(.k)  -  *,Exw(*)]} 

(32) 

should be close to a zero-mean, unit variance white noise sequence.
Here, the approximate measurement matrix $h_{x^{1i}}(k)$ is computed in a
manner similar to the first two rows of the matrix in Eq. (26), but
referenced to $(X_s, Y_s, Z_s)$, the location of station $s$, instead of the
location of the first station $(X_1, Y_1, Z_1)$. The algebraic parity test is
simply to evaluate the parity equation (32).

V.  Example 

The track association algorithm presented in the last section is
applied to simulation data in this section. Three radar installations
were located at the positions given by Table 1, and two targets
were both modeled as ninth-order linear time-invariant discrete-time
systems with the dynamics

\[ \dot x(t) = F x(t) + \Gamma w(t) \tag{33} \]

where

\[ F = \begin{bmatrix} 0_{3 \times 3} & I_{3 \times 3} & 0_{3 \times 3} \\ 0_{3 \times 3} & 0_{3 \times 3} & I_{3 \times 3} \\ 0_{3 \times 3} & 0_{3 \times 3} & -a I_{3 \times 3} \end{bmatrix}, \qquad \Gamma \triangleq \begin{bmatrix} 0_{3 \times 3} \\ 0_{3 \times 3} \\ I_{3 \times 3} \end{bmatrix} \tag{34} \]

and  where  w  is  a  zero  mean  Brownian  motion  process  with  covari¬ 
ance  /a  x  3  and  a  =  ^  is  the  time  constant  for  the  first-order  filters 
that  model  target  maneuvers  as  colored  noise  processes.  We  sample 
this  model  at  intervals  of  T  =0.1  s  to  generate  the  discrete  time 
dynamics 

x(k  + 1)  =  Ax(k)  +  Bw(k)  (35) 


n=r 

j  0 


er,Bdt,  £[*(*)]  =03*, 


E[w(k)wT  (/)]  =I3x  j4,  (36) 

The  targets  began  the  simulation  with  the  initial  conditions 

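\[ x_1(0) = [0 \;\; 220{,}000 \;\; 30{,}000 \;\; 250 \;\; -1000 \;\; 0 \;\; 0 \;\; 0 \;\; 0]^T \]

\[ x_2(0) = [50{,}000 \;\; 20{,}000 \;\; 35{,}000 \;\; -250 \;\; 1000 \;\; 0 \;\; 0 \;\; 0 \;\; 0]^T \]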

This configuration corresponds to the two targets initially moving
directly toward each other, in a line that almost passes through station
2. In the simulation, they pass closest to each other at $t = 99.2$ s.
Each measurement station measures the angles $\phi_s^t$ and $\psi_s^t$ to each
target at every sample time. These measurements are subject to
additive, normally distributed zero-mean white measurement noise
with standard deviation 1 deg. We assume that the measurement
noise is independent between sensors at all stations. Each MGEKF
begins with the a priori information

\[ \hat x^{1i}(0) = [25{,}000 \;\; 120{,}000 \;\; 32{,}500 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0]^T \]

\[ P^{ij}(0) = 10^7 \times I_{9 \times 9} \]
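The geometry of the initial conditions can be sanity-checked with a constant-velocity calculation. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the first entry of the first target's initial state is taken to be 0 (the extracted text is ambiguous there; this reading makes the a priori estimate the midpoint of the two initial positions), and maneuver accelerations are ignored, so the result is a nominal value rather than the 99.2 s observed in the noisy simulation.

```python
def closest_approach_time(x1, x2):
    """Time of minimum separation for two constant-velocity targets.

    Each state lists position (3 entries) followed by velocity (3 entries);
    any further entries are ignored.
    """
    dp = [a - b for a, b in zip(x1[:3], x2[:3])]
    dv = [a - b for a, b in zip(x1[3:6], x2[3:6])]
    return -sum(p * v for p, v in zip(dp, dv)) / sum(v * v for v in dv)


# Initial states in meters and meters per second (first entry of X1 assumed 0).
X1 = [0.0, 220000.0, 30000.0, 250.0, -1000.0, 0.0, 0.0, 0.0, 0.0]
X2 = [50000.0, 20000.0, 35000.0, -250.0, 1000.0, 0.0, 0.0, 0.0, 0.0]
T_CLOSEST = closest_approach_time(X1, X2)
MIDPOINT = [(a + b) / 2.0 for a, b in zip(X1[:3], X2[:3])]
```

Under these assumptions the nominal closest approach occurs at 100 s, close to the simulated value, and the midpoint of the initial positions coincides with the position part of the a priori estimate.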

Finally, we assume that the local stations were able to separate their
measurements from clutter perfectly using methods like those of
Reid,9 Bar-Shalom and Fortmann,4 or Fortmann and Bar-Shalom.8

Figure 3 plots the weighted innovations of a MGEKF that uses
measurements from stations 1 and 2 that correspond to the second
target, whereas Fig. 4 plots the weighted innovations of a MGEKF
that uses measurements that are mismatched. Note that the innovations
for the correct match appear to be a zero-mean white noise
sequence, whereas the innovations for the incorrect match are larger
and are not white. To better observe the behavior of these sequences,
their means were estimated using a Kalman filter (assuming that each
element of the weighted innovation of the MGEKF was a measurement
of a process that had integrator dynamics, process noise with
Fig. 9 Error statistic suggested by Bar-Shalom3 and Bar-Shalom and
Fortmann4: $(\hat x_1 - \hat x_2)^T \mathrm{E}[(\hat x_1 - \hat x_2)(\hat x_1 - \hat x_2)^T]^{-1}(\hat x_1 - \hat x_2)$.

Fig. 10 Euclidean norm of error in tracking target 1.

were  several  instances  where  nearly  singular  matrices  were  inverted 
in  the  algorithm  that  computes  the  covariance  of  the  difference 
between  two  local  estimates. 

Part of the reason for this difficulty is explained in Fig. 10, a
plot of the Euclidean norm of the estimation error. The solid line
corresponds to a MGEKF that uses measurements from both stations
1 and 2, whereas the dotted line is from a filter that only used station
1 measurements. Any method that relies on estimates that only use
a single station's measurements is subject to a large error. This is
not a huge concern for linear estimators, but the matrix PiJ defined
by Eq. (29) may not necessarily reflect this error.


We  have  also  encountered  cases  where  a  single  station  measure¬ 
ment  MGEKF  was  divergent  in  the  radial  direction  to  the  target,  but 
no  such  difficulties  have  appeared  when  data  from  two  geograph¬ 
ically  disparate  stations  was  used.  One  way  of  generating  such  a 
divergent  case  was  to  decrease  the  maneuver  colored  noise  auto¬ 
correlation  parameter  a  to  ^  or  below.  We  note  that  values  of  this 
parameter  below  ^  correspond  to  slower  maneuvers,  a  commonly 
encountered  situation. 


VI. Conclusions


This paper describes residual-based techniques for solving the
radar track association problem for bearings-only measurements.
The association between the tracks at two stations can be determined
by examining the residuals of a bank of MGEKFs. Once this
association is established, an algebraic parity test can find the
correspondence between tracks at other stations and targets tracked by
the first two stations.

One may ask why detection filters are necessary: Why not do
everything with algebraic parity tests? Although the detection
filtering step is not strictly necessary, it does improve the quality of
the track associations because the state estimates constructed from
two widely separated stations are so much more accurate than the
estimates from a single station.

To ensure the quality of the estimates from the MGEKFs, one
could delay the algebraic parity testing steps for associating tracks
from additional stations. If these parity tests are replaced with
additional detection filter banks until the estimates before and after
including a new station's measurements are sufficiently close, then
the fidelity of the estimates can be guaranteed.
ABSTRACT

In this paper, we present two statistical techniques
appropriate for the "validation" of integer ambiguities and
the  detection  of  cycle  slips.  The  multiple  hypothesis 
Wald  sequential  probability  ratio  test  (SPRT)  can  find  the 
conditional  probability  that  each  set  of  integer  biases  under 
consideration  is  the  true  bias  condition.  The  multiple 
hypothesis  Shiryayev  SPRT  determines  the  conditional 
probability  that  the  integer  biases  have  jumped  from  the 
nominal  bias  condition  to  each  member  of  a  collection 
of  other  bias  conditions.  Hence,  the  Wald  SPRT  is  a 
method  for  validating  the  integer  ambiguities  during  the 
initial  ambiguity  resolution,  while  the  Shiryayev  SPRT  can 
be  used  to  monitor  for  cycle  slips. 

Each of these multiple hypothesis SPRTs (MHSPRTs)
makes use of two measurement residuals. One is a geometric
combination of the carrier phase measurements, and
the other is generated by differencing the carrier phase
measurements with code measurements.

Prior work on cycle slip monitoring has focused solely
on the detection of the occurrence of a cycle slip in the
fastest time, balanced against the probability of issuing a
false alarm. Once a disruption has occurred, the ambiguity
resolution process must restart from scratch. The Shiryayev
SPRT bypasses this problem, as it announces the location
of the biases after the jump, in addition to the time of the
cycle slip.

The  calculations  for  the  MHSPRTs  are  not  linked 
to  any  particular  distribution,  unlike  prior  efforts.  Only 
the  probability  density  functions  of  the  measurement 
residuals  are  required.  Hence,  the  techniques  can  correctly 
compensate  for  non-Gaussian  errors  in  measurement  such 
as  multipath. 

For each hypothesis under consideration, the MHSPRTs
yield the probability of that hypothesis being the
correct one. The "state" of the MHSPRT recursions is the
vector of all of these probabilities. Information from past
measurements is embedded in this state. This recursive,
probabilistic framework makes it very straightforward to
add new hypotheses into the set of possible bias conditions
while retaining information from prior measurements.

Results from successful simulations and field experiments
are presented, showing the efficacy of our
techniques.

1  INTRODUCTION 

This paper describes new techniques for resolving a particular
problem inherent in determining relative positions
using the Global Positioning System (GPS). GPS was
originally designed to determine the positions of antennae
relative to the Earth, but when one is interested only in
the positions of two antennae relative to each other, more
precise "differential GPS" (DGPS) methods may be used.
The most precise DGPS method is carrier phase DGPS,
which measures the difference in the phase of the GPS
carrier signal between two receivers. To create a relative
position estimate using carrier phase GPS, the unknown
number of full cycles of the carrier signal (the "integer
ambiguity") between the GPS receivers must be found and
added to the differential phase. Standard least-squares
estimation techniques generate floating point estimates of
the integer ambiguity that can narrow the space of cycle
numbers that must be searched in order to calculate ranges
accurately. The small number of unknown cycles that
could correspond to the error in the floating point estimate
constitutes the set of biases from which the integer ambiguity
resolution algorithm must choose. This paper proposes
several new algorithms for integer ambiguity resolution
and cycle slip detection based on two statistical tests:
the multiple hypothesis Wald sequential probability ratio test
and the multiple hypothesis Shiryayev sequential probability
ratio test.

There are several methods currently in use for
resolving the integers, of which Teunissen's least-squares
ambiguity decorrelation adjustment (LAMBDA) method
[1, 2] is the most popular. The other commonly used
integer resolution method is the ambiguity function method
of Counselman and Gourevitch, and its variants [3, 4, 5].
Both of these methods generate estimates of the differential
position in the process of determining the integers. In
contrast, the method recently developed by Park et al. [6,
7] eliminates the differential state from the residual used
to determine the integers. We will use this residual in
our development, since its independence in time makes it
valuable for statistical tests.

All  of  these  prior  methods  use  either  Chi-squared  tests 
or  F-tests,  so  they  all  can  potentially  benefit  from  the  more 
sophisticated  recursive  statistical  methods  described  in  this 
paper. 

Mertikas and Rizos [8] have developed a scheme for
detecting cycle slips in carrier phase GPS measurements.
In their paper, they apply the CUSUM test [9, 10] to
the residual of a Kalman filter in order to detect cycle
slips. In order for this scheme to work, they must assume
a dynamical structure for the integers that is somewhat
artificial. Also, while their method will announce when a
cycle slip has occurred, it cannot determine the position of
the new integer bias. Once a cycle slip has been detected by
their methods, a new integer ambiguity search must begin.

In contrast, the cycle slip detection methods we
propose in the next sections explicitly announce the new
integer ambiguities, as well as the time of the cycle slip.
The only assumption about cycle slips that we make is an a
priori probability of a slip, which is far less of a leap than
constructing dynamics for the disruption. The statistical
tests we use are also more computationally efficient than
CUSUM. As with our methods for determining the initial
integer ambiguities, the residual we analyze requires
no estimation of the relative position between the GPS
antennae.

The paper is organized as follows: The Shiryayev
and Wald multiple hypothesis sequential probability ratio
tests are derived in Section 2. The improved integer
resolution algorithm is presented in Section 3. Section 4
presents a residual that enables the designer to increase
computational efficiency in exchange for longer sampling
periods. A similar residual, presented in Section 5,
allows the integer ambiguity associated with a newly
acquired satellite to be rapidly determined when the other
integer ambiguities are already known. Section 6 contains
experimental results using data from simulations and from
actual GPS measurements. The paper concludes with
Section 7.


2 MULTIPLE HYPOTHESIS SEQUENTIAL PROBABILITY
RATIO TEST


In  this  section  we  present  two  sophisticated  statistical 
tests  for  determining  the  most  likely  event  from  a  set  of 
hypotheses.  The  multiple  hypothesis  Shiryayev  sequential 
probability  ratio  test  (MHSSPRT)  detects  jumps  from  a 
base  hypothesis  to  another  hypothesis  in  the  set.  The 
multiple  hypothesis  Wald  sequential  probability  ratio  test 
(MHWSPRT),  which  is  a  special  case  of  the  MHSSPRT, 
determines  the  most  likely  event  from  a  set  of  hypotheses, 
assuming  that  the  event  is  true  for  all  time. 

The  MHSSPRT  and  MHWSPRT  can  be  applied 
in  place  of  the  Chi-squared  test  in  [6,  7].  The 
MHWSPRT  yields  somewhat  better  convergence  times 
than  Chi-squared,  and  the  MHSSPRT  allows  one  to 
monitor  for  cycle  slips.  However,  the  best  improvement 
comes  when  these  tests  are  applied  to  the  enlarged  residual 
presented  in  the  next  section. 

The material in this section is adapted from the work
of Malladi and Speyer [11].


2.1  RECURSIVE  RELATION  FOR  SHIRYAYEV 
SEQUENTIAL  PROBABILITY  RATIO  TEST 


Suppose that we have a set of different hypotheses
$\{H_0, H_1, \ldots, H_m\}$. We wish to know if there is a
transition from the base hypothesis $H_0$ to any of the other
hypotheses, and the time that the transition occurs. We
will derive a recursive formula that at each time step
computes the probability that a transition has occurred to
each hypothesis, given the measurement residual sequence
up to that time.

Let us define the following notation in this section:

$r(k)$        Measurement residual vector at time $k$.
$R(k)$        Measurement residual history up to time $k$.
$\theta_i$    Time of transition to hypothesis $H_i$.
$E_i(k)$      Event $\{\theta_i \le k+1\}$.
$F_i(k)$      $P(\theta_i \le k \mid R(k))$.
$\pi_i$       $P(\theta_i \le 0)$.
$p_i$         A priori probability of transition to hypothesis $H_i$ from time $k$ to $k+1$.
$f_i(\cdot)$  Probability density function of $r$ given $H_i$.
$f_0(\cdot)$  Probability density function of $r$ given $H_0$.
$m+1$         Number of hypotheses.
$\phi_i(k)$   $P(\theta_i \le k+1 \mid R(k))$.

We assume that the measurement residual sequence
$\{r(k)\}$ is conditionally independent, i.e. the measurement
residual sequence is independent once a disruption occurs.
We also assume that the probability distributions of $r(k)$
given $H_i$ are known for every $i$. In particular, all of the
probability density functions $f_i(\cdot)$ are known.

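We will derive a recursive relation for $F_i(k)$, $i = 0, 1, 2, \ldots, m$.

Note first that there is a simple relation between $\phi_i(k)$
and $F_i(k)$ for $i > 0$:

\begin{align}
\phi_i(k) &= P(\theta_i \le k+1 \mid R(k)) \tag{1} \\
&= P(\theta_i \le k \mid R(k)) + P(\theta_i = k+1 \mid R(k)) \tag{2} \\
&= P(\theta_i \le k \mid R(k)) + P(\theta_i = k+1 \mid \theta_i > k, R(k)) \cdot P(\theta_i > k \mid R(k)) \tag{3} \\
&= F_i(k) + p_i \cdot (1 - F_i(k)). \tag{4}
\end{align}

Computing $\phi_0(k)$ is slightly more complicated. Define the
set of events $\{\bar{E}_i(k)\}$ so that $\bar{E}_i(k)$ is the complement of
$E_i(k)$ for $i = 1, 2, \ldots, m$. If we assume that the events
$\{E_i(k)\}$ are independent of each other, the probability of
no transition before time $k+1$ is given by

\begin{align}
\phi_0(k) &= 1 - P\Big(\bigcup_{i=1}^{m} E_i(k) \,\Big|\, R(k)\Big) \tag{5} \\
&= P\Big(\bigcap_{i=1}^{m} \bar{E}_i(k) \,\Big|\, R(k)\Big) \tag{6} \\
&= \prod_{i=1}^{m} P(\bar{E}_i(k) \mid R(k)) \tag{7} \\
&= \prod_{i=1}^{m} \{1 - P(E_i(k) \mid R(k))\} \tag{8} \\
&= \prod_{i=1}^{m} \{1 - P(\theta_i \le k+1 \mid R(k))\} \tag{9} \\
&= \prod_{i=1}^{m} \{1 - \phi_i(k)\}. \tag{10}
\end{align}

Lemma 2.1. $F_i(k+1)$ is a function of $F_j(k)$, $j = 0, 1, \ldots, m$
in the following manner:

\begin{equation}
F_i(k+1) = \frac{\phi_i(k) \cdot f_i(r(k+1))}{\sum_{j=0}^{m} \phi_j(k) \cdot f_j(r(k+1))}. \tag{11}
\end{equation}

Proof. By induction. We begin by showing that

\begin{equation}
F_i(1) = \frac{\phi_i(0) \cdot f_i(r(1))}{\sum_{j=0}^{m} \phi_j(0) \cdot f_j(r(1))}. \tag{12}
\end{equation}

By Bayes' Rule,

\begin{align}
F_i(1) = P(\theta_i \le 1 \mid r(1)) &= \frac{P(r(1), \theta_i \le 1)}{P(r(1))}
= \frac{P(r(1) \mid \theta_i \le 1) \cdot P(\theta_i \le 1)}{P(r(1))}, \tag{13} \\
P(r(1)) &= \sum_{j=0}^{m} P(r(1) \mid \theta_j \le 1) \cdot P(\theta_j \le 1), \tag{14} \\
\phi_j(0) &= P(\theta_j \le 1 \mid \text{no measurements}) = P(\theta_j \le 1) \quad \forall j. \tag{15}
\end{align}

Also,

\begin{equation}
P(r(1) \mid \theta_j \le 1) = f_j(r(1)) \cdot dr(1) \quad \forall j. \tag{16}
\end{equation}

Hence,

\begin{equation}
F_i(1) = \frac{P(r(1) \mid \theta_i \le 1) \cdot P(\theta_i \le 1)}{P(r(1))}
= \frac{f_i(r(1)) \cdot dr(1) \cdot \phi_i(0)}{\sum_{j=0}^{m} f_j(r(1)) \cdot dr(1) \cdot \phi_j(0)}
= \frac{f_i(r(1)) \cdot \phi_i(0)}{\sum_{j=0}^{m} f_j(r(1)) \cdot \phi_j(0)}. \tag{17}
\end{equation}

We next show that if we know $\{F_0(k), F_1(k), \ldots, F_m(k)\}$, then

\begin{equation}
F_i(k+1) = \frac{\phi_i(k) \cdot f_i(r(k+1))}{\sum_{j=0}^{m} \phi_j(k) \cdot f_j(r(k+1))}, \tag{18}
\end{equation}

which is a function of $\{F_0(k), F_1(k), \ldots, F_m(k)\}$ via the
relations (4) and (10). At stage $k+1$,

\begin{align}
F_i(k+1) = P(\theta_i \le k+1 \mid R(k+1))
&= \frac{P(R(k+1) \mid \theta_i \le k+1) \cdot P(\theta_i \le k+1)}{P(R(k+1))}, \tag{19} \\
P(R(k+1)) &= P(r(k+1) \mid R(k)) \cdot P(R(k)), \tag{20} \\
P(R(k) \mid \theta_i \le k+1) &= \frac{P(\theta_i \le k+1 \mid R(k)) \cdot P(R(k))}{P(\theta_i \le k+1)}, \tag{21} \\
P(r(k+1) \mid \theta_j \le k+1) &= f_j(r(k+1)) \cdot dr(k+1) \quad \forall j. \tag{22}
\end{align}

We now use the conditional independence of $\{r(k)\}$ to
write

\begin{equation}
F_i(k+1) = P(\theta_i \le k+1 \mid R(k+1))
= \frac{1}{P(R(k+1))} \cdot P(r(k+1) \mid \theta_i \le k+1) \cdot P(R(k) \mid \theta_i \le k+1) \cdot P(\theta_i \le k+1). \tag{23}
\end{equation}

Substituting from (20) to (22) into (23), we have

\begin{align}
F_i(k+1) &= \frac{1}{P(r(k+1) \mid R(k)) \cdot P(R(k))} \cdot f_i(r(k+1)) \cdot dr(k+1) \cdot
\frac{P(\theta_i \le k+1 \mid R(k)) \cdot P(R(k))}{P(\theta_i \le k+1)} \cdot P(\theta_i \le k+1) \tag{24} \\
&= \frac{1}{P(r(k+1) \mid R(k))} \cdot f_i(r(k+1)) \cdot dr(k+1) \cdot P(\theta_i \le k+1 \mid R(k)) \tag{25} \\
&= \frac{f_i(r(k+1)) \cdot dr(k+1) \cdot \phi_i(k)}{P(r(k+1) \mid R(k))}. \tag{26}
\end{align}

Moreover,

\begin{equation}
P(r(k+1) \mid R(k)) = \sum_{j=0}^{m} P(r(k+1) \mid \theta_j \le k+1) \cdot P(\theta_j \le k+1 \mid R(k))
= \sum_{j=0}^{m} f_j(r(k+1)) \cdot dr(k+1) \cdot \phi_j(k), \tag{27}
\end{equation}

so we can substitute into (26) to get

\begin{equation}
F_i(k+1) = \frac{f_i(r(k+1)) \cdot \phi_i(k)}{\sum_{j=0}^{m} f_j(r(k+1)) \cdot \phi_j(k)}. \tag{28}
\end{equation}

Our induction is thus complete.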
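The recursion of Lemma 2.1 can be illustrated with a short numerical sketch. The code below is only an illustration under stated assumptions: it uses NumPy, the function names `phi_from_F` and `mhssprt_step` are hypothetical, index 0 is taken to be the base hypothesis, and the likelihood values in the example are made up rather than taken from any measurement.

```python
import numpy as np

def phi_from_F(F, p):
    """Propagate F_i(k) to phi_i(k) using relations (4) and (10).

    F : length m+1 array of posterior probabilities, F[0] for the base hypothesis.
    p : length m+1 array of a priori transition probabilities (p[0] is ignored).
    """
    F = np.asarray(F, dtype=float)
    p = np.asarray(p, dtype=float)
    phi = np.empty_like(F)
    phi[1:] = F[1:] + p[1:] * (1.0 - F[1:])   # relation (4), i > 0
    phi[0] = np.prod(1.0 - phi[1:])           # relation (10), no transition yet
    return phi

def mhssprt_step(F, p, likelihoods):
    """One step of the recursion: F_i(k+1) from F_i(k) and f_i(r(k+1))."""
    phi = phi_from_F(F, p)
    weighted = phi * np.asarray(likelihoods, dtype=float)
    return weighted / weighted.sum()

# Illustrative run: start in the base hypothesis, then feed residuals whose
# (made-up) densities favour hypothesis 1.
F = np.array([1.0, 0.0, 0.0])
p = np.array([0.0, 0.01, 0.01])
for _ in range(20):
    F = mhssprt_step(F, p, likelihoods=[0.01, 1.0, 0.01])
```

Because each step adds the transition probability $p_i$ before weighting by the densities, a hypothesis that starts with zero probability can still gain mass, which is what lets the test follow a jump away from the base hypothesis.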


2.2 RELATION TO A MULTIPLE HYPOTHESIS
WALD SEQUENTIAL PROBABILITY RATIO
TEST

If we restrict ourselves to the case where one hypothesis
is correct for all time (i.e. we will never jump from
one hypothesis to another), we reduce to the Wald [12]
sequential probability ratio test:

\begin{equation}
F_i(k+1) = \frac{F_i(k) \cdot f_i(r(k+1))}{\sum_{j=0}^{m} F_j(k) \cdot f_j(r(k+1))}. \tag{29}
\end{equation}

\[ \phi_i(k) = F_i(k) + p_i \cdot (1 - F_i(k)) \]
\[ \phi_0(k) = \prod_{i=1}^{m} \{1 - \phi_i(k)\} \]

Table 1: Summary of Shiryayev sequential probability ratio
test

This is quite easy to show. Because there are no
hypothesis jumps, $p_i = 0$ for all $i$, so

\begin{equation}
\phi_i(k) = F_i(k) + p_i \cdot (1 - F_i(k)) = F_i(k) \tag{31}
\end{equation}

for all $i > 0$. Also, the set of events $\{E_i(k)\}$ is now
mutually exclusive since the entire measurement sequence
corresponds to a single hypothesis. Hence

\begin{align}
\phi_0(k) &= 1 - P\Big(\bigcup_{i=1}^{m} E_i(k) \,\Big|\, R(k)\Big) \tag{32} \\
&= 1 - \sum_{i=1}^{m} P(E_i(k) \mid R(k)) \tag{33} \\
&= 1 - \sum_{i=1}^{m} \phi_i(k) \tag{34} \\
&= 1 - \sum_{i=1}^{m} F_i(k) \tag{35} \\
&= F_0(k). \tag{36}
\end{align}

Thus, the recursive expression from the previous
subsection becomes

\begin{align}
F_i(k+1) &= \frac{\phi_i(k) \cdot f_i(r(k+1))}{\sum_{j=0}^{m} \phi_j(k) \cdot f_j(r(k+1))} \tag{37} \\
&= \frac{F_i(k) \cdot f_i(r(k+1))}{\sum_{j=0}^{m} F_j(k) \cdot f_j(r(k+1))}. \tag{38}
\end{align}
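As a minimal sketch of the Wald special case, the update reduces to multiplying each probability by its density value and renormalising. The function name `mhwsprt_step`, the uniform prior, and the density values below are illustrative assumptions, not values from the experiments.

```python
import numpy as np

def mhwsprt_step(F, likelihoods):
    """Wald update: F_i(k+1) is proportional to F_i(k) * f_i(r(k+1))."""
    weighted = np.asarray(F, dtype=float) * np.asarray(likelihoods, dtype=float)
    return weighted / weighted.sum()

# Three hypotheses with a uniform prior; the made-up densities favour index 1.
F = np.full(3, 1.0 / 3.0)
for _ in range(5):
    F = mhwsprt_step(F, [0.2, 1.0, 0.5])
```

One consequence of dropping the transition probabilities is visible directly in the update: a hypothesis whose probability reaches zero stays at zero, so this form suits the initial resolution of a fixed bias rather than the monitoring of jumps.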


3 AN IMPROVED METHOD FOR INTEGER AMBIGUITY
RESOLUTION

In  this  section  we  propose  applying  the  statistical  tests  of 
the  last  section  to  an  enlarged  residual  that  uses  both  carrier 
phase  and  code  information. 

The  method  we  propose  may  be  used  on  either  single 
or  double  differenced  GPS  pseudoranges.  For  simplicity, 
we  will  derive  the  method  on  double  differenced  data. 
The  conversion  of  the  method  to  single  differenced  data  is 
straightforward. 

Let us begin with the linearized carrier phase and code
measurement equations:

\begin{align}
\nabla\Delta\phi(k)\lambda &= \nabla H(k)\,\delta x(k) - \lambda\nabla\Delta N + \nabla\eta_{car}(k), \tag{39} \\
\nabla\delta\rho(k) &= \nabla H(k)\,\delta x(k) + \nabla\eta_{code}(k), \tag{40}
\end{align}

where $\nabla\eta_{car}$ and $\nabla\eta_{code}$ are independent zero-mean
Gaussian random sequences with variances $\nabla V_{car}$ and
$\nabla V_{code}$, respectively. We can eliminate the terms
dependent on $\delta x$ by subtracting the code measurement
from the carrier phase measurement, yielding the following
relation:

\begin{equation}
r^1(k) \triangleq \nabla\Delta\phi(k)\lambda - \nabla\delta\rho(k)
= \nabla\eta_{car}(k) - \nabla\eta_{code}(k) - \lambda\nabla\Delta N. \tag{41}
\end{equation}

Note that $r^1$ is an independent Gaussian random sequence
with mean $-\lambda\nabla\Delta N$ and variance $(\nabla V_{car} + \nabla V_{code})$.
Following the methodology of Park et al. [6, 7],
we can find an $E(k)$ that is a left annihilator of $\nabla H(k)$.
Multiplying the carrier phase measurement on the left by
$E(k)$, we arrive at

\begin{equation}
r^2(k) \triangleq E(k)\nabla\Delta\phi(k)\lambda
= E(k)\nabla\eta_{car}(k) - \lambda E(k)\nabla\Delta N. \tag{42}
\end{equation}

Then $r^2(k)$ is an independent Gaussian random sequence with mean
$-\lambda E(k)\nabla\Delta N$ and variance $E(k)\nabla V_{car}(k)E^T(k)$.
Construct the vector $r(k)$ as follows:

\begin{equation}
r(k) = \begin{bmatrix} r^1(k) \\ r^2(k) \end{bmatrix}
= \begin{bmatrix} \nabla\Delta\phi(k)\lambda - \nabla\delta\rho(k) \\ E(k)\nabla\Delta\phi(k)\lambda \end{bmatrix}
= \begin{bmatrix} \nabla\eta_{car}(k) - \nabla\eta_{code}(k) - \lambda\nabla\Delta N \\ E(k)\nabla\eta_{car}(k) - \lambda E(k)\nabla\Delta N \end{bmatrix}. \tag{43}
\end{equation}

Then $r(k)$ is an independent Gaussian random sequence
with mean $m_r(\nabla\Delta N, k)$ and variance $V_r(k)$ given by

\begin{align}
m_r(\nabla\Delta N, k) &= \begin{bmatrix} -\lambda\nabla\Delta N \\ -\lambda E(k)\nabla\Delta N \end{bmatrix}, \tag{44} \\
V_r(k) &= \begin{bmatrix} \nabla V_{code}(k) + \nabla V_{car}(k) & \nabla V_{car}(k)E^T(k) \\ E(k)\nabla V_{car}(k) & E(k)\nabla V_{car}(k)E^T(k) \end{bmatrix}. \tag{45}
\end{align}

Our proposed algorithm for integer ambiguity resolution
is simply to apply MHWSPRT or MHSSPRT to $r(k)$,
with the hypothesis set $\{H_1, \ldots, H_m\}$ containing all
of the values of $\nabla\Delta N$ that are under consideration. We
outline it in detail below.

Algorithm 3.1.

1. Determine the values of $\{\nabla\Delta N_i\}$, $i = 1, 2, \ldots, m$
under consideration as hypotheses. This can be
done either by taking a set number of integers away
from the code position estimate for each satellite,
or by dividing the satellites into an independent and
dependent set as in Park et al.

2. Initialize the probabilities $F_i(0)$ to their a priori
values (for MHWSPRT, usually $1/m$, where $m$ is
the number of hypotheses under consideration; for
MHSSPRT, usually 1 for the base hypothesis and 0
for the other hypotheses). Set $k = 0$.

3. Take the $(k+1)$th measurements $\nabla\Delta\phi(k+1)$ and
$\nabla\delta\rho(k+1)$.

4. Evaluate $f_i(r(k+1))$ for $i = 1, 2, \ldots, m$ as follows:

\[ f_i(r(k+1)) = \exp\{-\tfrac{1}{2}\, r_i(k+1)^T\, V_r(k+1)^{-1}\, r_i(k+1)\}, \]

where

\[ r_i(k+1) = \begin{bmatrix} r_i^1(k+1) \\ r_i^2(k+1) \end{bmatrix}, \]

\[ r_i^1(k+1) = \nabla\Delta\phi(k+1)\lambda - \nabla\delta\rho(k+1) + \lambda\nabla\Delta N_i, \]

and

\[ r_i^2(k+1) = E(k+1)\nabla\Delta\phi(k+1)\lambda + \lambda E(k+1)\nabla\Delta N_i. \]

Note that since all the hypotheses under consideration
have identical covariances, the constant term preceding
the exponent has been eliminated in the above
expression. Under hypothesis $i$, the shifted residual
$r_i(k+1)$ is zero mean with covariance $V_r(k+1)$, which is
why the exponent is the usual Gaussian quadratic form.

5. Calculate $\{F_i(k+1)\}$ using $\{F_i(k)\}$ and $\{f_i(r(k+1))\}$
with either MHWSPRT or MHSSPRT, depending
on whether we are determining the initial ambiguity
or monitoring for cycle slips.

6. If we reach a desired threshold with any of the $\{F_i(k+1)\}$,
declare the initial integer ambiguity and begin
monitoring for cycle slips (MHWSPRT) or declare a
cycle slip and reset the base hypothesis (MHSSPRT).

7. Go to step 3.
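The steps of Algorithm 3.1 can be sketched on synthetic data as follows. This is an illustration under several assumptions that are not from the paper: NumPy is used, the geometry matrix is a fixed random stand-in, the left annihilator is computed from a singular value decomposition, the wavelength of 0.86 m and the noise levels are placeholder values, the candidate set is every integer vector with entries in {-1, 0, 1}, and all function names are hypothetical.

```python
import itertools
import numpy as np

def left_annihilator(H):
    """Return E with orthonormal rows spanning the left null space of H (E @ H = 0)."""
    U, s, _ = np.linalg.svd(H)
    rank = int(np.sum(s > 1e-10 * s.max()))
    return U[:, rank:].T

def log_likelihoods(phase_m, code_m, E, V_car, V_code, candidates, lam):
    """Step 4: exponent of f_i(r(k+1)) for every candidate integer vector.

    phase_m: double-differenced carrier phase times wavelength (metres).
    code_m:  double-differenced code measurement (metres).
    """
    V_r = np.block([[V_code + V_car, V_car @ E.T],
                    [E @ V_car, E @ V_car @ E.T]])     # covariance of r(k)
    V_r_inv = np.linalg.inv(V_r)
    out = np.empty(len(candidates))
    for i, N in enumerate(candidates):
        r1 = phase_m - code_m + lam * N                 # carrier-minus-code part
        r2 = E @ phase_m + lam * (E @ N)                # annihilated carrier part
        r = np.concatenate([r1, r2])
        out[i] = -0.5 * r @ V_r_inv @ r
    return out

def wald_update(F, loglik):
    """Step 5 (MHWSPRT): weight by the likelihoods and renormalise."""
    w = F * np.exp(loglik - loglik.max())   # shift by the max to avoid underflow
    return w / w.sum()

def demo(epochs=200, seed=0):
    """Run steps 1 to 7 on synthetic data (all numbers are illustrative)."""
    rng = np.random.default_rng(seed)
    n, lam = 5, 0.86                         # 5 double differences, placeholder wavelength
    H = rng.normal(size=(n, 3))              # stand-in for the geometry matrix
    E = left_annihilator(H)
    V_car = (0.02 ** 2) * np.eye(n)
    V_code = (1.0 ** 2) * np.eye(n)
    candidates = np.array(list(itertools.product([-1, 0, 1], repeat=n)), dtype=float)
    N_true = np.array([1.0, 0.0, -1.0, 0.0, 1.0])
    F = np.full(len(candidates), 1.0 / len(candidates))        # step 2
    for _ in range(epochs):                                     # steps 3 to 7
        dx = rng.normal(size=3)
        phase_m = H @ dx - lam * N_true + rng.normal(scale=0.02, size=n)
        code_m = H @ dx + rng.normal(scale=1.0, size=n)
        loglik = log_likelihoods(phase_m, code_m, E, V_car, V_code, candidates, lam)
        F = wald_update(F, loglik)
    return candidates[np.argmax(F)], N_true, F.max()
```

Working with the exponent and subtracting its maximum before exponentiating is a numerical convenience only; it leaves the normalised probabilities unchanged because the common factor cancels in the ratio.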

For pedagogical reasons, we have used conventional
L1 code pseudoranges in constructing the residual $r^1(k)$.
It is better in practice to use narrowlane code pseudorange
combinations instead, because the combination of widelane
carrier pseudoranges and narrowlane code pseudoranges
yields a residual that contains no errors from ionospheric
delay [13].


4 RESOLVING INTEGER AMBIGUITIES SEPARATELY

A key problem with the integer ambiguity resolution
scheme we have presented, as well as with algorithms of
the type proposed by Park et al., is that the number of
hypotheses that must be considered is large, as a result
of the combinatorial relationship between the number of
satellites and the number of integers to be examined per
satellite. We propose a technique to alleviate this problem
below.

Our  approach  attempts  to  construct  residuals  such  that 
each  residual  is  only  affected  by  the  integer  ambiguity 
of  a  single  satellite.  Then  the  integer  ambiguities  of  the 
satellites  may  be  determined  in  parallel,  with  each  parallel 
element  making  a  choice  from  a  small  number  of  possible 
hypotheses. 
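To make the size of the saving concrete, the following sketch compares the number of hypotheses in a joint search with the number examined when each ambiguity is tested on its own. The function names are hypothetical, and the choice of five candidate values for each of five double differences is taken from the dynamic test reported in Section 6.2.1; the candidate range of -2 to 2 is an assumption made only for the enumeration.

```python
from itertools import product

def joint_hypothesis_count(values_per_satellite, satellites):
    """Joint search: every combination of candidate integers is one hypothesis."""
    return values_per_satellite ** satellites

def separate_hypothesis_count(values_per_satellite, satellites):
    """Separate search: one small test per ambiguity, run in parallel."""
    return values_per_satellite * satellites

def enumerate_joint(values, satellites):
    """List every candidate integer vector for the joint search."""
    return list(product(values, repeat=satellites))

joint = joint_hypothesis_count(5, 5)         # 3125 joint hypotheses
separate = separate_hypothesis_count(5, 5)   # 25 hypotheses in total
vectors = enumerate_joint(range(-2, 3), 5)
```

The joint count grows exponentially with the number of satellites while the separate count grows linearly, which is the trade the section describes: fewer hypotheses per test at the cost of using less information in each residual.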

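Consider the carrier phase and code measurement
equations again:

\begin{equation}
\begin{bmatrix} \nabla\Delta\phi(k)\lambda \\ \nabla\delta\rho(k) \end{bmatrix}
= \begin{bmatrix} \nabla H(k) & -\lambda I \\ \nabla H(k) & 0 \end{bmatrix}
\begin{bmatrix} \delta x(k) \\ \nabla\Delta N \end{bmatrix}
+ \begin{bmatrix} \nabla\eta_{car}(k) \\ \nabla\eta_{code}(k) \end{bmatrix}. \tag{46}
\end{equation}

Suppose that we are only interested in the $j$th integer
ambiguity $\nabla\Delta N^{(j)}$. If we exclude measurement equations
with any of the other integer ambiguities, the equation
above reduces to

\begin{equation}
\begin{bmatrix} \nabla\Delta\phi^{(j)}(k)\lambda \\ \nabla\delta\rho(k) \end{bmatrix}
= \begin{bmatrix} \nabla h^{(j)}(k) \\ \nabla H(k) \end{bmatrix} \delta x(k)
- \begin{bmatrix} 1 \\ 0 \end{bmatrix} \nabla\Delta N^{(j)}\lambda
+ \begin{bmatrix} \nabla\eta^{(j)}_{car}(k) \\ \nabla\eta_{code}(k) \end{bmatrix}. \tag{47}
\end{equation}

Denote by $E^{(j)}(k)$ the left annihilator of the matrix

\[ \begin{bmatrix} \nabla h^{(j)}(k) \\ \nabla H(k) \end{bmatrix}. \]

Multiplying (47) by $E^{(j)}(k)$ on the left yields

\begin{equation}
r^{(j)}(k) \triangleq E^{(j)}(k) \begin{bmatrix} \nabla\Delta\phi^{(j)}(k)\lambda \\ \nabla\delta\rho(k) \end{bmatrix}
= E^{(j)}(k) \begin{bmatrix} \nabla\eta^{(j)}_{car}(k) \\ \nabla\eta_{code}(k) \end{bmatrix}
- E^{(j)}(k) \begin{bmatrix} 1 \\ 0 \end{bmatrix} \nabla\Delta N^{(j)}\lambda. \tag{48}
\end{equation}

Hence, $r^{(j)}(k)$ is a noise process, with distribution
determined by the integer ambiguity $\nabla\Delta N^{(j)}$ and the
joint distribution of $\eta^{(j)}_{car}$ and $\eta_{code}$. The residual $r^{(j)}(k)$
can thus be tested with a MHSPRT to determine which
hypothesis is the correct value for $\nabla\Delta N^{(j)}$, independent
of the other integer ambiguities.


5 A RESIDUAL FOR RESOLVING THE INTEGERS
OF NEWLY ACQUIRED SATELLITES

If the integers corresponding to a number of satellites
have been resolved and a new satellite comes into view,
resolving the integer ambiguity of the new satellite is
especially easy. Let $\nabla\Delta N$ be the vector of
resolved integer ambiguities corresponding to the carrier
phase measurement vector $\nabla\Delta\phi(k)$. The
carrier phase measurement corresponding to the new
satellite is $\nabla\Delta\phi^{(M)}(k)$, and the integer ambiguity we seek
to resolve is $\nabla\Delta N^{(M)}$. The measurement equations can
then be written as

\begin{equation}
\begin{bmatrix} \nabla\Delta\phi^{(M)}(k) \\ (\nabla\Delta\phi(k) + \nabla\Delta N)\lambda \end{bmatrix}
= \begin{bmatrix} h^{(M)}(k) \\ H(k) \end{bmatrix} \delta x(k)
+ \begin{bmatrix} -1 \\ 0 \end{bmatrix} \nabla\Delta N^{(M)}
+ \begin{bmatrix} \eta^{(M)}_{car} \\ \eta_{car} \end{bmatrix}. \tag{49}
\end{equation}

Construct a measurement residual $r^{(M)}(k)$ by multiplying
(49) by $E^{(M)}(k)$, the left annihilator of the matrix

\[ \begin{bmatrix} h^{(M)}(k) \\ H(k) \end{bmatrix}: \]

\begin{equation}
r^{(M)}(k) = E^{(M)}(k) \begin{bmatrix} \nabla\Delta\phi^{(M)}(k) \\ (\nabla\Delta\phi(k) + \nabla\Delta N)\lambda \end{bmatrix}
= E^{(M)}(k) \begin{bmatrix} \eta^{(M)}_{car} \\ \eta_{car} \end{bmatrix}
+ E^{(M)}(k) \begin{bmatrix} -1 \\ 0 \end{bmatrix} \nabla\Delta N^{(M)}. \tag{50}
\end{equation}

The measurement residual $r^{(M)}(k)$ is a random noise
sequence, with distribution determined by $\nabla\Delta N^{(M)}$
and the joint distribution of $\eta^{(M)}_{car}$ and $\eta_{car}$. Taking
advantage of the small number of possible hypotheses
under consideration and the low noise associated with
$r^{(M)}(k)$, a MHWSPRT can quickly determine the correct
value of the new integer ambiguity $\nabla\Delta N^{(M)}$.


6 EXPERIMENTS

In this section, we evaluate the performance of the methods
derived in the previous sections.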

6.1 STATIONARY SINGLE ANTENNA EXPERIMENT

We  first  constructed  an  experimental  apparatus  in  which  the 
integer  ambiguity  was  known.  We  connected  two  Ashtech 
model  Z-12  GPS  receivers  to  a  single  Sensor  Systems 
model S67-1575-96 L1/L2 active antenna, so that the
integer  bias  was  known  to  be  zero  for  every  carrier  phase 
measurement.  We  then  compared  the  results  of  a  Wald 
test  using  the  residual  (43)  to  those  using  the  carrier-only 
residual  (42). 


The data sequences we measured contained observations
of at least seven satellites. To test the algorithm,
we eliminated some of the measurements, so that
either five or six satellites were visible. These
reduced data sets were double differenced and widelaned,
and then processed by the integer ambiguity resolution
algorithm. With either residual, the algorithm always
correctly concluded that the integer biases were zero. We
have plotted the time history of the maximum valued $F_i$
for both the five and six satellite cases in Figures 1 and 2.
Note that while the addition of GPS code measurements
only slightly improves the convergence in the six satellite
case, the improvement in the five satellite case is quite
significant.


[Plot: convergence comparison on single antenna data, 5 satellites visible; horizontal axis: sample number]

Figure 1: Comparison between different residuals: Maximum
value of $F_i$ vs. time, 5 satellites visible


[Plot: convergence comparison on single antenna data, 6 satellites visible; horizontal axis: sample number]

Figure 2: Comparison between different residuals: Maximum
value of $F_i$ vs. time, 6 satellites visible

Finally, we looked at the performance of the Wald
test applied to the error residual in (41). Using the same
original measurement sequence, we reduced the number of
visible satellites in the data to four before processing them.
In this case, the correct hypothesis was not determined until
after 75 measurements, due to a large excursion that one of
the code measurements took from its mean. In Figure 3
we plotted the time history of the most probable hypothesis
(the correct hypothesis number in this case was 14). In
Figure 4 we plotted the time history of the maximum valued
$F_i$.

Note  that  when  our  algorithm  assumed  that  the 
standard  deviation  of  the  code  measurements  was  1  meter, 
the  Wald  test  told  us  that  it  was  100%  sure  that  an  incorrect 
hypothesis  was  correct!  To  avoid  this  overconfidence 
problem  in  our  algorithm,  we  increased  the  standard 
deviation  of  the  code  measurements  to  2  meters.  While  this 
did  not  decrease  the  time  at  which  the  correct  hypothesis 
was  declared  the  most  probable,  it  did  avoid  the  problem  of 
declaring  the  wrong  hypothesis  correct. 
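The effect of the assumed noise level on overconfidence can be illustrated with a two-hypothesis Wald test. This is only a sketch with made-up numbers: the three residual values of 3 m, the 0.86 m separation between the hypothesis means, and the function name are assumptions chosen for illustration and are not the data behind Figures 3 and 4.

```python
import math

def wrong_hypothesis_probability(residuals, sigma, offset):
    """Two-hypothesis Wald test: posterior probability of the wrong hypothesis.

    The correct hypothesis predicts a zero-mean residual; the wrong one
    predicts mean `offset`. Both start with prior probability 0.5 and
    share the assumed standard deviation `sigma`.
    """
    log_ratio = 0.0
    for r in residuals:
        # log of f_wrong(r) / f_correct(r) for equal-variance Gaussians
        log_ratio += (r * r - (r - offset) ** 2) / (2.0 * sigma ** 2)
    return 1.0 / (1.0 + math.exp(-log_ratio))

# Hypothetical excursion: three residuals pulled 3 m away from the true mean
excursion = [3.0, 3.0, 3.0]
p_sigma_1 = wrong_hypothesis_probability(excursion, sigma=1.0, offset=0.86)
p_sigma_2 = wrong_hypothesis_probability(excursion, sigma=2.0, offset=0.86)
```

With the smaller assumed standard deviation the wrong hypothesis is pushed close to certainty by the same excursion, whereas doubling the assumed standard deviation divides the log-likelihood ratio by four and leaves room for later measurements to correct the decision.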


Figure 3: Most probable hypothesis vs. time, 4 satellites
visible. The correct hypothesis number is 14

Figure 4: Maximum value of $F_i$ vs. time, 4 satellites visible


6.2 DYNAMIC TESTS

In this subsection, several different techniques for resolving
integer ambiguity were applied to the same data set.
The GPS data was collected using a test rig in which
two Sensor Systems model S67-1575-96 L1/L2 active
GPS antennae were placed a fixed distance from each
other (2.3 meters). The test rig was mounted to a car,
which was driven at moderate speed for several minutes
while the GPS measurements were recorded from Ashtech
model Z-12 GPS receivers. The results of the integer
ambiguity resolution schemes could be readily verified,
as filtering of the code measurements using the resolved
integers generated a distance estimate between the two
antennae, which would not match the true distance unless
the resolved integers were correct. The experimental
data was collected beginning Friday, October 13, 2000 at
19:22:33.5 PST.

6.2.1  WALD  TEST  USING  MULTIPLE  CARRIER 
MEASUREMENTS  AND  MULTIPLE  CODE 
MEASUREMENTS 

A Wald test using a residual generated by 5 double
differenced widelane carrier measurements and 5 double
differenced narrowlane code measurements quickly
resolved the correct integer ambiguities. There were
five integer values under consideration for each double
differenced carrier measurement, for a total of $5^5 = 3125$
different hypotheses. The noise in each of the
single differenced widelane carrier phase measurements
was assumed to be a zero mean Gaussian white noise
process with a $2\sqrt{2}$ cm standard deviation. The noise in
each single differenced widelane code measurement was
assumed to be a zero mean Gaussian white noise process
with a $\sqrt{2}$ m standard deviation. Figure 5 shows the time
history of the probability of the most probable hypothesis.

6.2.2  WALD  TEST  USING  A  SINGLE  CARRIER 
MEASUREMENT  AND  MULTIPLE  CODE 
MEASUREMENTS 

When a Wald test was performed using the residual (48),
the correct integer ambiguities were again resolved, albeit
more slowly. The residual for each integer ambiguity
used a single double differenced widelane carrier phase
measurement and as many double differenced narrowlane
code measurements as were available at each epoch
(between 5 and 7). There were 9 integer values under
consideration for each satellite. The noise in the double
differenced carrier measurement was assumed to be a zero
mean Gaussian white noise process with standard deviation
$2\sqrt{2}$ cm. The noise in each double differenced code
measurement was assumed to be a zero mean Gaussian
white noise process with standard deviation 2 m. Figure
6 shows the time history of the probability of the most
probable hypothesis for one integer ambiguity, a pattern
that was typical for all the ambiguities for which we
searched. The price of the convenience, simplicity and
low computational cost associated with evaluating the
ambiguities separately was slower convergence of the Wald
test.


Figure 5: Maximum value of $F_i$ vs. time, residual uses 5
double differenced carrier and 5 double differenced code
measurements


Figure 6: Maximum value of $F_i$ vs. time, evaluating integer
ambiguities separately


6.2.3 SHIRYAYEV TEST USING MULTIPLE CARRIER
MEASUREMENTS AND MULTIPLE
CODE MEASUREMENTS

Since  there  were  no  cycle  slips  in  the  observed  data,  we 
artificially  introduced  one  into  the  measurement  sequence 
in  order  to  test  our  cycle  slip  monitoring  scheme.  We 
injected  the  cycle  slip  into  one  of  the  carrier  phase 
measurements  at  the  100th  epoch.  We  then  ran  a  Shiryayev 
test  on  the  same  residual  as  before,  using  the  integers 
resolved  by  the  Wald  test  as  the  nominal  hypothesis.  The 
assumed  standard  deviation  of  the  carrier  measurements 
was increased to 8 cm, as lower values made the test
too  eager  to  declare  a  cycle  slip.  Figure  7  shows  the 
time  history  of  the  most  probable  hypothesis.  The  correct 
hypothesis  number  is  313  before  100  samples,  and  it  is  312 
after  100  samples.  Figure  8  plots  the  time  history  of  the 
probability of the most probable hypothesis. The Shiryayev
test immediately detected that a cycle slip had occurred, but
it took 30 samples to identify the correct hypothesis
as the most probable one, and an additional 20 samples
until the probability of that hypothesis was near 100%.
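
A simplified change-detection recursion in the spirit of this test is sketched below. It is an assumption-laden toy, not the exact Shiryayev recursion of the paper: at each epoch a small fraction `p_slip` of the nominal hypothesis probability is spread evenly over the alternatives before the Bayes measurement update. The scalar residual, the 0.862 m wavelength, the value of `p_slip` and all names are hypothetical; only the 8 cm standard deviation and the slip at the 100th epoch are taken from the text.

```python
import numpy as np

def shiryayev_update(prob, residuals, sigma, nominal, p_slip):
    """Propagate for a possible slip, then apply a Bayes measurement update."""
    n_hyp = prob.size
    # Time propagation (assumed slip model): probability leaks from the
    # nominal hypothesis to every alternative with equal weight.
    leak = p_slip * prob[nominal]
    prior = prob + leak / (n_hyp - 1)
    prior[nominal] = prob[nominal] - leak
    # Measurement update with zero-mean Gaussian residuals.
    log_post = np.log(np.maximum(prior, 1e-300)) - 0.5 * (residuals / sigma) ** 2
    log_post -= log_post.max()
    post = np.exp(log_post)
    return post / post.sum()

def run_shiryayev(n_epochs=200, slip_epoch=100, seed=1):
    rng = np.random.default_rng(seed)
    wavelength = 0.862                      # m, assumed widelane wavelength
    sigma = 0.08                            # m, the 8 cm value quoted in the text
    offsets = np.arange(-4, 5)              # integer offsets from the resolved value
    nominal = int(np.where(offsets == 0)[0][0])
    prob = np.zeros(offsets.size)
    prob[nominal] = 1.0                     # start from the resolved integers
    best = []
    for k in range(n_epochs):
        slip = 0 if k < slip_epoch else -1  # one-cycle slip injected at slip_epoch
        y = slip * wavelength + sigma * rng.standard_normal()
        residuals = y - offsets * wavelength
        prob = shiryayev_update(prob, residuals, sigma, nominal, p_slip=1e-4)
        best.append(int(offsets[np.argmax(prob)]))
    return best, prob

best, prob = run_shiryayev()
print("offset before slip:", best[50], " offset at end:", best[-1])
```

Because this toy uses a single, well-separated scalar residual, it identifies the new hypothesis far faster than the 30 samples reported above. It is meant only to show how the recursion reports the new integer hypothesis rather than merely flagging a slip.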

Figure 7: Most probable hypothesis vs. time, correct value
is 313 before 100 samples and 312 afterwards


Figure 8: Maximum value of F_i vs. time, cycle slip occurs
after 100 samples


7 CONCLUSION

This paper outlines the integer ambiguity problem for
GPS and describes some new methods for resolving the
integer ambiguities and for detecting cycle slips. The
main contribution is the application of the Wald and
Shiryayev multiple hypothesis sequential probability ratio
tests (MHSPRTs) to dynamically compute the conditional
probabilities of members of a set of integer hypotheses being
correct. Since the MHSPRTs require evaluations of the
probability density functions of the measurement residuals,
the expressions we constructed for the probability density
functions of several residuals are also of interest. A large
number of experiments, performed on both simulations and
actual GPS measurements, demonstrate the effectiveness
of our methods.

Although there are many other methods for identifying
the initial integer biases and for detecting cycle slips,
the  methods  presented  here  possess  several  advantages 
over  their  peers.  Chief  among  these  advantages  is  the 
presentation  of  information  about  the  integer  hypotheses  in 
a  probabilistic  framework,  instead  of  the  cumulative  sum 
approach  used  by  competing  methods.  This  probabilistic 
framework  allows  for  easy  accommodation  of  events  such 
as  the  introduction  of  new  hypotheses.  Other  advantages 
of  our  methods  are  cycle  slip  detection  that  automatically 
determines  the  new  integer  hypothesis  after  the  slip  (rather 
than  simply  announcing  that  a  cycle  slip  has  occurred), 
provisions  for  the  easy  accommodation  of  non-Gaussian 
measurement  noises,  and  efficient  computation  due  to  the 
recursive  nature  of  the  MHSPRTs. 
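
The accommodation of non-Gaussian noise can be made concrete with a small sketch: the recursive update only needs the residual density evaluated under each hypothesis, so the density can be passed in as a function. The Laplace density below is an arbitrary heavier-tailed example chosen for illustration, and all function names are hypothetical rather than taken from the paper.

```python
import numpy as np

def gaussian_logpdf(r, scale):
    # scale is the standard deviation of the residual noise
    return -0.5 * (r / scale) ** 2 - np.log(scale * np.sqrt(2.0 * np.pi))

def laplace_logpdf(r, scale):
    # scale is the Laplace diversity parameter b (heavier tails than Gaussian)
    return -np.abs(r) / scale - np.log(2.0 * scale)

def mhsprt_update(prob, residuals, logpdf, scale):
    """Recursive probability update with a pluggable residual density."""
    log_post = np.log(np.maximum(prob, 1e-300)) + logpdf(residuals, scale)
    log_post -= log_post.max()              # guard against underflow
    post = np.exp(log_post)
    return post / post.sum()

# Three hypotheses with equal prior probability; the first residual is smallest.
prior = np.full(3, 1.0 / 3.0)
residuals = np.array([0.1, 1.0, -0.9])
post_gauss = mhsprt_update(prior, residuals, gaussian_logpdf, 0.5)
post_laplace = mhsprt_update(prior, residuals, laplace_logpdf, 0.5)
print(np.argmax(post_gauss), np.argmax(post_laplace))
```

Both densities favour the hypothesis with the smallest residual here; they differ in how strongly a single large residual penalises a hypothesis.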

The  most  significant  disadvantage  of  our  method  is 
that  only  a  finite  number  of  hypotheses  may  be  considered. 
Further,  the  computational  cost  increases  as  the  number 
of  hypotheses  increases.  We  have  presented  a  residual 
that  drastically  limits  the  number  of  hypotheses  under 
consideration,  but  the  price  is  an  increase  in  the  noise  of 
the residual. This gives a GPS receiver designer
the option of trading the computational load at
each epoch against the number of epochs of data required to
accurately determine the integer ambiguities.
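
The size of this trade can be illustrated by counting hypotheses. The sketch below assumes that every ambiguity is searched over the same number of candidate integers (9, as in the experiments above) and compares a joint search over all ambiguities with separate per-ambiguity searches. The function name and the choice of 5 and 7 ambiguities are illustrative assumptions.

```python
def hypothesis_counts(n_ambiguities, n_candidates=9):
    """Return (joint, separate) hypothesis counts per epoch."""
    joint = n_candidates ** n_ambiguities      # every combination of integers
    separate = n_candidates * n_ambiguities    # each ambiguity tested on its own
    return joint, separate

for n in (5, 7):
    joint, separate = hypothesis_counts(n)
    print(f"{n} ambiguities: joint = {joint}, separate = {separate}")
# 5 ambiguities: joint = 59049, separate = 45
# 7 ambiguities: joint = 4782969, separate = 63
```

The per-epoch cost of the recursive update grows in proportion to these counts, while the noisier per-ambiguity residual needs more epochs to converge.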

In  closing,  note  that  our  methodology  need  not 
be  exclusive  of  other  techniques  for  integer  ambiguity 
resolution. For instance, a small set of admissible
integer ambiguity hypotheses can be obtained via Teunissen’s
LAMBDA method [1, 2]. These hypotheses could then be
analyzed using our techniques. In making this set small,
the  LAMBDA  method  greatly  reduces  the  computational 
time  required  by  our  techniques. 